NOVEL NUCLEIC ACIDS AND POLYPEPTIDES

1. TECHNICAL FIELD 

The present invention provides novel polynucleotides and proteins encoded by such
polynucleotides, along with uses for these polynucleotides and proteins, for example in
therapeutic, diagnostic and research methods.



2. BACKGROUND 

Technology aimed at the discovery of protein factors (including, e.g., cytokines, such as
lymphokines, interferons, CSFs, chemokines, and interleukins) has matured rapidly over the past
decade. The now routine hybridization cloning and expression cloning techniques clone novel
polynucleotides "directly" in the sense that they rely on information directly related to the
discovered protein (i.e., partial DNA/amino acid sequence of the protein in the case of
hybridization cloning; activity of the protein in the case of expression cloning). More recent
"indirect" cloning techniques such as signal sequence cloning, which isolates DNA sequences
based on the presence of a now well-recognized secretory leader sequence motif, as well as
various PCR-based or low stringency hybridization-based cloning techniques, have advanced the
state of the art by making available large numbers of DNA/amino acid sequences for proteins
that are known to have biological activity, for example, by virtue of their secreted nature in the
case of leader sequence cloning, by virtue of their cell or tissue source in the case of PCR-based
techniques, or by virtue of structural similarity to other genes of known biological activity.

Identified polynucleotide and polypeptide sequences have numerous applications in, for
example, diagnostics, forensics, gene mapping, identification of mutations responsible for
genetic disorders or other traits, assessment of biodiversity, and production of many other types
of data and products dependent on DNA and amino acid sequences.



3. SUMMARY OF THE INVENTION 

The compositions of the present invention include novel isolated polypeptides, novel
isolated polynucleotides encoding such polypeptides, including recombinant DNA molecules,
cloned genes or degenerate variants thereof, especially naturally occurring variants such as allelic
variants, antisense polynucleotide molecules, and antibodies that specifically recognize one or more
epitopes present on such polypeptides, as well as hybridomas producing such antibodies.

The compositions of the present invention additionally include vectors, including expression
vectors, containing the polynucleotides of the invention, cells genetically engineered to contain such
polynucleotides and cells genetically engineered to express such polynucleotides.

The present invention relates to a collection or library of at least one novel nucleic acid
sequence assembled from expressed sequence tags (ESTs) isolated mainly by sequencing by
hybridization (SBH), and in some cases, sequences obtained from one or more public databases.
The invention relates also to the proteins encoded by such polynucleotides, along with therapeutic,
diagnostic and research utilities for these polynucleotides and proteins. These nucleic acid
sequences are designated as SEQ ID NO: 1-1350. The polypeptide sequences are designated SEQ
ID NO: 1351-2700. The nucleic acids and polypeptides are provided in the Sequence Listing. In
the nucleic acids provided in the Sequence Listing, A is adenine; C is cytosine; G is guanine; T is
thymine; and N is any of the four bases. In the amino acids provided in the Sequence Listing, *
corresponds to the stop codon.

The nucleic acid sequences of the present invention also include nucleic acid sequences that
hybridize to the complement of SEQ ID NO: 1-1350 under stringent hybridization conditions;
nucleic acid sequences which are allelic variants or species homologues of any of the nucleic acid
sequences recited above; or nucleic acid sequences that encode a peptide comprising a specific
domain or truncation of the peptides encoded by SEQ ID NO: 1-1350. Also included is a
polynucleotide comprising a nucleotide sequence having at least 90% identity to an identifying
sequence of SEQ ID NO: 1-1350 or a degenerate variant or fragment thereof. The identifying
sequence can be 100 base pairs in length.

The nucleic acid sequences of the present invention also include the sequence information
from the nucleic acid sequences of SEQ ID NO: 1-1350. The sequence information can be a
segment of any one of SEQ ID NO: 1-1350 that uniquely identifies or represents the sequence
information of SEQ ID NO: 1-1350.

A collection as used in this application can be a collection of only one polynucleotide. The
collection of sequence information or identifying information of each sequence can be provided on
a nucleic acid array. In one embodiment, segments of sequence information are provided on a
nucleic acid array to detect the polynucleotide that contains the segment. The array can be designed
to detect full-match or mismatch to the polynucleotide that contains the segment. The collection
can also be provided in a computer-readable format.

This invention also includes the reverse or direct complement of any of the nucleic acid
sequences recited above; cloning or expression vectors containing the nucleic acid sequences; and
host cells or organisms transformed with these expression vectors. Nucleic acid sequences (or their
reverse or direct complements) according to the invention have numerous applications in a variety
of techniques known to those skilled in the art of molecular biology, such as use as hybridization
probes, use as primers for PCR, use in an array, use in computer-readable media, use in sequencing
full-length genes, use for chromosome and gene mapping, use in the recombinant production of
protein, and use in the generation of anti-sense DNA or RNA, their chemical analogs and the like.

In a preferred embodiment, the nucleic acid sequences of SEQ ID NO: 1-1350 or novel
segments or parts of the nucleic acids of the invention are used as primers in expression assays that
are well known in the art. In a particularly preferred embodiment, the nucleic acid sequences of
SEQ ID NO: 1-1350 or novel segments or parts of the nucleic acids provided herein are used in
diagnostics for identifying expressed genes or, as well known in the art and exemplified by Vollrath
et al., Science 258:52-59 (1992), as expressed sequence tags for physical mapping of the human
genome.

The isolated polynucleotides of the invention include, but are not limited to, a
polynucleotide comprising any one of the nucleotide sequences set forth in SEQ ID NO: 1-1350; a
polynucleotide comprising any of the full length protein coding sequences of SEQ ID NO: 1-1350;
and a polynucleotide comprising any of the nucleotide sequences of the mature protein coding
sequences of SEQ ID NO: 1-1350. The polynucleotides of the present invention also include, but
are not limited to, a polynucleotide that hybridizes under stringent hybridization conditions to (a)
the complement of any one of the nucleotide sequences set forth in SEQ ID NO: 1-1350; (b) a
nucleotide sequence encoding any one of the amino acid sequences set forth in the Sequence Listing
(e.g., SEQ ID NO: 1351-2700); (c) a polynucleotide which is an allelic variant of any
polynucleotides recited above; (d) a polynucleotide which encodes a species homolog (e.g.,
orthologs) of any of the proteins recited above; or (e) a polynucleotide that encodes a polypeptide
comprising a specific domain or truncation of any of the polypeptides comprising an amino acid
sequence set forth in the Sequence Listing.

The isolated polypeptides of the invention include, but are not limited to, a polypeptide
comprising any of the amino acid sequences set forth in the Sequence Listing; or the corresponding
full length or mature protein. Polypeptides of the invention also include polypeptides with
biological activity that are encoded by (a) any of the polynucleotides having a nucleotide sequence
set forth in SEQ ID NO: 1-1350; or (b) polynucleotides that hybridize to the complement of the
polynucleotides of (a) under stringent hybridization conditions. Biologically or immunologically
active variants of any of the polypeptide sequences in the Sequence Listing, and "substantial
equivalents" thereof (e.g., with at least about 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or 99%
amino acid sequence identity) that preferably retain biological activity are also contemplated. The
polypeptides of the invention may be wholly or partially chemically synthesized but are preferably
produced by recombinant means using the genetically engineered cells (e.g., host cells) of the
invention.

The invention also provides compositions comprising a polypeptide of the invention.
Polypeptide compositions of the invention may further comprise an acceptable carrier, such as a
hydrophilic, e.g., pharmaceutically acceptable, carrier.

The invention also provides host cells transformed or transfected with a polynucleotide of
the invention.

The invention also relates to methods for producing a polypeptide of the invention
comprising growing a culture of the host cells of the invention in a suitable culture medium
under conditions permitting expression of the desired polypeptide, and purifying the polypeptide
from the culture or from the host cells. Preferred embodiments include those in which the
protein produced by such process is a mature form of the protein.

Polynucleotides according to the invention have numerous applications in a variety of
techniques known to those skilled in the art of molecular biology. These techniques include use
as hybridization probes, use as oligomers, or primers, for PCR, use for chromosome and gene
mapping, use in the recombinant production of protein, and use in generation of anti-sense DNA
or RNA, their chemical analogs and the like. For example, when the expression of an mRNA is
largely restricted to a particular cell or tissue type, polynucleotides of the invention can be used
as hybridization probes to detect the presence of the particular cell or tissue mRNA in a sample
using, e.g., in situ hybridization.

In other exemplary embodiments, the polynucleotides are used in diagnostics as
expressed sequence tags for identifying expressed genes or, as well known in the art and
exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for
physical mapping of the human genome.

The polypeptides according to the invention can be used in a variety of conventional
procedures and methods that are currently applied to other proteins. For example, a polypeptide
of the invention can be used to generate an antibody that specifically binds the polypeptide. Such
antibodies, particularly monoclonal antibodies, are useful for detecting or quantitating the
polypeptide in tissue. The polypeptides of the invention can also be used as molecular weight
markers, and as a food supplement.

Methods are also provided for preventing, treating, or ameliorating a medical condition
which comprises the step of administering to a mammalian subject a therapeutically effective
amount of a composition comprising a polypeptide of the present invention and a
pharmaceutically acceptable carrier.

In particular, the polypeptides and polynucleotides of the invention can be utilized, for
example, in methods for the prevention and/or treatment of disorders involving aberrant protein
expression or biological activity.

The present invention further relates to methods for detecting the presence of the
polynucleotides or polypeptides of the invention in a sample. Such methods can, for example, be
utilized as part of prognostic and diagnostic evaluation of disorders as recited herein and for the
identification of subjects exhibiting a predisposition to such conditions. The invention provides
a method for detecting the polynucleotides of the invention in a sample, comprising contacting
the sample with a compound that binds to and forms a complex with the polynucleotide of
interest for a period sufficient to form the complex and under conditions sufficient to form a
complex and detecting the complex such that if a complex is detected, the polynucleotide of
interest is detected. The invention also provides a method for detecting the polypeptides of the
invention in a sample comprising contacting the sample with a compound that binds to and forms
a complex with the polypeptide under conditions and for a period sufficient to form the complex
and detecting the formation of the complex such that if a complex is formed, the polypeptide is
detected.

The invention also provides kits comprising polynucleotide probes and/or monoclonal
antibodies, and optionally quantitative standards, for carrying out methods of the invention.
Furthermore, the invention provides methods for evaluating the efficacy of drugs, and
monitoring the progress of patients, involved in clinical trials for the treatment of disorders as
recited above.

The invention also provides methods for the identification of compounds that modulate
(i.e., increase or decrease) the expression or activity of the polynucleotides and/or polypeptides
of the invention. Such methods can be utilized, for example, for the identification of compounds
that can ameliorate symptoms of disorders as recited herein. Such methods can include, but are
not limited to, assays for identifying compounds and other substances that interact with (e.g.,
bind to) the polypeptides of the invention. The invention provides a method for identifying a
compound that binds to the polypeptides of the invention comprising contacting the compound
with a polypeptide of the invention in a cell for a time sufficient to form a
polypeptide/compound complex, wherein the complex drives expression of a reporter gene
sequence in the cell; and detecting the complex by detecting the reporter gene sequence
expression such that if expression of the reporter gene is detected the compound that binds to a
polypeptide of the invention is identified.

The invention also provides methods for treatment which involve the
administration of the polynucleotides or polypeptides of the invention to individuals exhibiting
symptoms or tendencies. In addition, the invention encompasses methods for treating diseases or
disorders as recited herein comprising administering compounds and other substances that
modulate the overall activity of the target gene products. Compounds and other substances can
effect such modulation either on the level of target gene/protein expression or target protein
activity.

The polypeptides of the present invention and the polynucleotides encoding them are also
useful for the same functions known to one of skill in the art as the polypeptides and
polynucleotides to which they have homology (set forth in Table 2). If no homology is set forth
for a sequence, then the polypeptides and polynucleotides of the present invention are useful for
a variety of applications, as described herein, including use in arrays for detection.



4. DETAILED DESCRIPTION OF THE INVENTION



4.1 DEFINITIONS 

It must be noted that as used herein and in the appended claims, the singular forms "a", 
"an" and "the" include plural references unless the context clearly dictates otherwise. 

The term "active" refers to those forms of the polypeptide which retain the biologic
and/or immunologic activities of any naturally occurring polypeptide. According to the
invention, the terms "biologically active" or "biological activity" refer to a protein or peptide
having structural, regulatory or biochemical functions of a naturally occurring molecule.
Likewise "immunologically active" or "immunological activity" refers to the capability of the
natural, recombinant or synthetic polypeptide to induce a specific immune response in
appropriate animals or cells and to bind with specific antibodies.

The term "activated cells" as used in this application refers to those cells which are engaged in
extracellular or intracellular membrane trafficking, including the export of secretory or
enzymatic molecules as part of a normal or disease process.

The terms "complementary" or "complementarity" refer to the natural binding of
polynucleotides by base pairing. For example, the sequence 5'-AGT-3' binds to the
complementary sequence 3'-TCA-5'. Complementarity between two single-stranded molecules
may be "partial" such that only some of the nucleic acids bind or it may be "complete" such that
total complementarity exists between the single stranded molecules. The degree of
complementarity between the nucleic acid strands has significant effects on the efficiency and
strength of the hybridization between the nucleic acid strands.
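
By way of illustration only, the pairing rule in the preceding definition can be sketched in Python. The function names and the treatment of N as pairing only with N are assumptions made for this sketch and are not part of the specification; "partial" and "complete" complementarity are expressed here simply as the fraction of aligned positions that pair.

```python
# Watson-Crick pairing for DNA bases. Treating N as pairing only with N
# is a simplifying assumption of this sketch.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G", "N": "N"}


def complement(seq_5to3):
    """Return the complementary strand written 3'->5' (position by position)."""
    return "".join(PAIRS[base] for base in seq_5to3.upper())


def reverse_complement(seq_5to3):
    """Return the complementary strand written in the usual 5'->3' direction."""
    return complement(seq_5to3)[::-1]


def fraction_paired(strand_5to3, other_3to5):
    """Fraction of aligned positions that base-pair (1.0 means complete complementarity)."""
    if len(strand_5to3) != len(other_3to5):
        raise ValueError("strands must have the same length")
    paired = sum(PAIRS[a] == b
                 for a, b in zip(strand_5to3.upper(), other_3to5.upper()))
    return paired / len(strand_5to3)


print(complement("AGT"))              # the 3'-TCA-5' strand from the example above
print(fraction_paired("AGT", "TCA"))  # complete complementarity
print(fraction_paired("AGT", "TCG"))  # partial complementarity
```

A value below 1.0 from fraction_paired corresponds to the "partial" case described above.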

The term "embryonic stem cells (ES)" refers to a cell that can give rise to many
differentiated cell types in an embryo or an adult, including the germ cells. The term "germ line
stem cells (GSCs)" refers to stem cells derived from primordial stem cells that provide a steady
and continuous source of germ cells for the production of gametes. The term "primordial germ
cells (PGCs)" refers to a small population of cells set aside from other cell lineages particularly
from the yolk sac, mesenteries, or gonadal ridges during embryogenesis that have the potential to
differentiate into germ cells and other cells. PGCs are the source from which GSCs and ES cells
are derived. The PGCs, the GSCs and the ES cells are capable of self-renewal. Thus these cells
not only populate the germ line and give rise to a plurality of terminally differentiated cells that
comprise the adult specialized organs, but are able to regenerate themselves.

The term "expression modulating fragment," EMF, means a series of nucleotides which 
modulates the expression of an operably linked ORF or another EMF. 

As used herein, a sequence is said to "modulate the expression of an operably linked
sequence" when the expression of the sequence is altered by the presence of the EMF. EMFs
include, but are not limited to, promoters, and promoter modulating sequences (inducible
elements). One class of EMFs are nucleic acid fragments which induce the expression of an
operably linked ORF in response to a specific regulatory factor or physiological event.

The terms "nucleotide sequence" or "nucleic acid" or "polynucleotide" or
"oligonucleotide" are used interchangeably and refer to a heteropolymer of nucleotides or the
sequence of these nucleotides. These phrases also refer to DNA or RNA of genomic or synthetic
origin which may be single-stranded or double-stranded and may represent the sense or the
antisense strand, to peptide nucleic acid (PNA) or to any DNA-like or RNA-like material. In the
sequences herein A is adenine, C is cytosine, T is thymine, G is guanine and N is A, C, G or T
(U). It is contemplated that where the polynucleotide is RNA, the T (thymine) in the sequences
provided herein is substituted with U (uracil). Generally, nucleic acid segments provided by this
invention may be assembled from fragments of the genome and short oligonucleotide linkers, or
from a series of oligonucleotides, or from individual nucleotides, to provide a synthetic nucleic
acid which is capable of being expressed in a recombinant transcriptional unit comprising
regulatory elements derived from a microbial or viral operon, or a eukaryotic gene.

The terms "oligonucleotide fragment" or a "polynucleotide fragment", "portion," or
"segment" or "probe" or "primer" are used interchangeably and refer to a sequence of nucleotide
residues which are at least about 5 nucleotides, more preferably at least about 7 nucleotides,
more preferably at least about 9 nucleotides, more preferably at least about 11 nucleotides and
most preferably at least about 17 nucleotides. The fragment is preferably less than about 500
nucleotides, preferably less than about 200 nucleotides, more preferably less than about 100
nucleotides, more preferably less than about 50 nucleotides and most preferably less than 30
nucleotides. Preferably the probe is from about 6 nucleotides to about 200 nucleotides,
preferably from about 15 to about 50 nucleotides, more preferably from about 17 to 30
nucleotides and most preferably from about 20 to 25 nucleotides. Preferably the fragments can
be used in polymerase chain reaction (PCR), various hybridization procedures or microarray
procedures to identify or amplify identical or related parts of mRNA or DNA molecules. A
fragment or segment may uniquely identify each polynucleotide sequence of the present
invention. Preferably the fragment comprises a sequence substantially similar to any one of SEQ
ID NOs: 1-1350.

Probes may, for example, be used to determine whether specific mRNA molecules are
present in a cell or tissue or to isolate similar nucleic acid sequences from chromosomal DNA as
described by Walsh et al. (Walsh, P.S. et al., 1992, PCR Methods Appl 1:241-250). They may
be labeled by nick translation, Klenow fill-in reaction, PCR, or other methods well known in the
art. Probes of the present invention, their preparation and/or labeling are elaborated in
Sambrook, J. et al., 1989, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratory, NY; or Ausubel, F.M. et al., 1989, Current Protocols in Molecular Biology, John
Wiley & Sons, New York NY, both of which are incorporated herein by reference in their
entirety.

The nucleic acid sequences of the present invention also include the sequence
information from the nucleic acid sequences of SEQ ID NO: 1-1350. The sequence information
can be a segment of any one of SEQ ID NO: 1-1350 that uniquely identifies or represents the
sequence information of that sequence of SEQ ID NO: 1-1350. One such segment can be a
twenty-mer nucleic acid sequence because the probability that a twenty-mer is fully matched in
the human genome is 1 in 300. In the human genome, there are three billion base pairs in one set
of chromosomes. Because 4^20 possible twenty-mers exist, there are 300 times more twenty-mers
than there are base pairs in a set of human chromosomes. Using the same analysis, the
probability for a seventeen-mer to be fully matched in the human genome is approximately 1 in
5. When these segments are used in arrays for expression studies, fifteen-mer segments can be
used. The probability that the fifteen-mer is fully matched in the expressed sequences is also
approximately one in five because expressed sequences comprise less than approximately 5% of
the entire genome sequence.
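
The arithmetic behind these estimates can be sketched as follows. This is a minimal illustration, assuming the three-billion-base-pair genome size and the 5% expressed fraction stated above; the function and variable names are hypothetical. Computed exactly, the ratios come out near 370 for a twenty-mer, about 6 for a seventeen-mer, and about 7 for a fifteen-mer against expressed sequence, which matches the rounded figures in the text in order of magnitude.

```python
GENOME_BP = 3_000_000_000   # base pairs in one set of human chromosomes (figure from the text)
EXPRESSED_FRACTION = 0.05   # text: expressed sequences are under ~5% of the genome


def kmers_per_site(k, target_bp):
    """Ratio of possible k-mers (4**k) to the number of base pairs searched.

    A ratio of R corresponds to roughly a 1-in-R chance that a given
    k-mer has a full match somewhere in the target by chance.
    """
    return 4 ** k / target_bp


genome_20 = kmers_per_site(20, GENOME_BP)
genome_17 = kmers_per_site(17, GENOME_BP)
expressed_15 = kmers_per_site(15, GENOME_BP * EXPRESSED_FRACTION)

print(f"20-mer vs genome:              about 1 in {genome_20:.1f}")
print(f"17-mer vs genome:              about 1 in {genome_17:.1f}")
print(f"15-mer vs expressed sequences: about 1 in {expressed_15:.1f}")
```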

Similarly, when using sequence information for detecting a single mismatch, a segment can
be a twenty-five mer. The probability that the twenty-five mer would appear in a human genome
with a single mismatch is calculated by multiplying the probability for a full match (1 ÷ 4^25) times the
increased probability for mismatch at each nucleotide position (3 x 25). The probability that an
eighteen mer with a single mismatch can be detected in an array for expression studies is
approximately one in five. The probability that a twenty-mer with a single mismatch can be
detected in a human genome is approximately one in five.

The term "open reading frame," ORF, means a series of nucleotide triplets coding for
amino acids without any termination codons and is a sequence translatable into protein.
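
As a hedged illustration of this definition, the following sketch scans one reading frame and returns the longest run of consecutive triplets that contains no termination codon. It assumes the standard genetic code (stop codons TAA, TAG and TGA) and does not require a start codon, since the definition above does not mention one; the function name is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code (an assumption of this sketch)


def longest_orf(seq, frame=0):
    """Return the longest stretch of consecutive non-stop codons in one frame.

    frame is 0, 1 or 2 and gives the offset of the first codon.
    Incomplete trailing codons are ignored.
    """
    seq = seq.upper()
    best = ""
    current = []
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon in STOP_CODONS:
            current = []          # a termination codon ends the current run
        else:
            current.append(codon)
            if len(current) * 3 > len(best):
                best = "".join(current)
    return best


example = "ATGGCCTAAGGGCCCAAATTT"
print(longest_orf(example, frame=0))
print(longest_orf(example, frame=1))
```

In the example sequence, frame 0 contains a TAA codon that splits the sequence into two runs, while frame 1 contains no termination codon at all.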

The terms "operably linked" or "operably associated" refer to functionally related nucleic
acid sequences. For example, a promoter is operably associated or operably linked with a coding
sequence if the promoter controls the transcription of the coding sequence. While operably
linked nucleic acid sequences can be contiguous and in the same reading frame, certain genetic
elements, e.g., repressor genes, are not contiguously linked to the coding sequence but still control
transcription/translation of the coding sequence.

The term "pluripotent" refers to the capability of a cell to differentiate into a number of
differentiated cell types that are present in an adult organism. A pluripotent cell is restricted in its
differentiation capability in comparison to a totipotent cell.

The terms "polypeptide" or "peptide" or "amino acid sequence" refer to an oligopeptide,
peptide, polypeptide or protein sequence or fragment thereof and to naturally occurring or
synthetic molecules. A polypeptide "fragment," "portion," or "segment" is a stretch of amino
acid residues of at least about 5 amino acids, preferably at least about 7 amino acids, more
preferably at least about 9 amino acids and most preferably at least about 17 or more amino
acids. The peptide preferably is not greater than about 200 amino acids, more preferably less
than 150 amino acids and most preferably less than 100 amino acids. Preferably the peptide is
from about 5 to about 200 amino acids. To be active, any polypeptide must have sufficient
length to display biological and/or immunological activity.

The term "naturally occurring polypeptide" refers to polypeptides produced by cells that
have not been genetically engineered and specifically contemplates various polypeptides arising
from post-translational modifications of the polypeptide including, but not limited to,
acetylation, carboxylation, glycosylation, phosphorylation, lipidation and acylation.

The term "translated protein coding portion" means a sequence which encodes for the full
length protein which may include any leader sequence or any processing sequence.

The term "mature protein coding sequence" means a sequence which encodes a peptide
or protein without a signal or leader sequence. The "mature protein portion" means that portion
of the protein which does not include a signal or leader sequence. The peptide may have been
produced by processing in the cell which removes any leader/signal sequence. The mature
protein portion may or may not include the initial methionine residue. The methionine residue
may be removed from the protein during processing in the cell. The peptide may be produced
synthetically or the protein may have been produced using a polynucleotide only encoding for
the mature protein coding sequence.

The term "derivative" refers to polypeptides chemically modified by such techniques as
ubiquitination, labeling (e.g., with radionuclides or various enzymes), covalent polymer
attachment such as pegylation (derivatization with polyethylene glycol) and insertion or
substitution by chemical synthesis of amino acids such as ornithine, which do not normally occur
in human proteins.

The term "variant" (or "analog") refers to any polypeptide differing from naturally
occurring polypeptides by amino acid insertions, deletions, and substitutions, created using, e.g.,
recombinant DNA techniques. Guidance in determining which amino acid residues may be
replaced, added or deleted without abolishing activities of interest, may be found by comparing
the sequence of the particular polypeptide with that of homologous peptides and minimizing the
number of amino acid sequence changes made in regions of high homology (conserved regions)
or by replacing amino acids with consensus sequence.

Alternatively, recombinant variants encoding these same or similar polypeptides may be
synthesized or selected by making use of the "redundancy" in the genetic code. Various codon
substitutions, such as the silent changes which produce various restriction sites, may be
introduced to optimize cloning into a plasmid or viral vector or expression in a particular
prokaryotic or eukaryotic system. Mutations in the polynucleotide sequence may be reflected in
the polypeptide or domains of other peptides added to the polypeptide to modify the properties of
any part of the polypeptide, to change characteristics such as ligand-binding affinities, interchain
affinities, or degradation/turnover rate.

Preferably, amino acid "substitutions" are the result of replacing one amino acid with
another amino acid having similar structural and/or chemical properties, i.e., conservative amino
acid replacements. "Conservative" amino acid substitutions may be made on the basis of
similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic
nature of the residues involved. For example, nonpolar (hydrophobic) amino acids include
alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan, and methionine; polar
neutral amino acids include glycine, serine, threonine, cysteine, tyrosine, asparagine, and
glutamine; positively charged (basic) amino acids include arginine, lysine, and histidine; and
negatively charged (acidic) amino acids include aspartic acid and glutamic acid. "Insertions" or
"deletions" are preferably in the range of about 1 to 20 amino acids, more preferably 1 to 10
amino acids. The variation allowed may be experimentally determined by systematically making
insertions, deletions, or substitutions of amino acids in a polypeptide molecule using
recombinant DNA techniques and assaying the resulting recombinant variants for activity.
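
The four residue groups listed above can be encoded directly, as in the following minimal sketch. It treats a substitution as "conservative" when both residues fall in the same listed group; this is only one simple reading of the passage, which also mentions other bases of similarity. The standard one-letter amino acid codes are used, and the names GROUPS, group_of and is_conservative are hypothetical.

```python
# Residue groups exactly as listed in the text, using one-letter codes.
GROUPS = {
    "nonpolar": set("ALIVPFWM"),       # Ala, Leu, Ile, Val, Pro, Phe, Trp, Met
    "polar_neutral": set("GSTCYNQ"),   # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
    "basic": set("RKH"),               # Arg, Lys, His
    "acidic": set("DE"),               # Asp, Glu
}


def group_of(residue):
    """Return the name of the group containing a one-letter residue code."""
    residue = residue.upper()
    for name, members in GROUPS.items():
        if residue in members:
            return name
    raise ValueError(f"unknown residue: {residue!r}")


def is_conservative(original, replacement):
    """True when both residues belong to the same group listed in the text."""
    return group_of(original) == group_of(replacement)


print(is_conservative("L", "I"))  # both nonpolar
print(is_conservative("K", "R"))  # both basic
print(is_conservative("D", "K"))  # acidic replaced by basic
```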
Alternatively, where alteration of function is desired, insertions, deletions or
non-conservative alterations can be engineered to produce altered polypeptides. Such alterations
can, for example, alter one or more of the biological functions or biochemical characteristics of
the polypeptides of the invention. For example, such alterations may change polypeptide
characteristics such as ligand-binding affinities, interchain affinities, or degradation/turnover
rate. Further, such alterations can be selected so as to generate polypeptides that are better suited
for expression, scale up and the like in the host cells chosen for expression. For example,
cysteine residues can be deleted or substituted with another amino acid residue in order to
eliminate disulfide bridges.

WO 01/57188    PCT/US01/03800

The terms "purified" or "substantially purified" as used herein denote that the indicated
nucleic acid or polypeptide is present in the substantial absence of other biological
macromolecules, e.g., polynucleotides, proteins, and the like. In one embodiment, the
polynucleotide or polypeptide is purified such that it constitutes at least 95% by weight, more
preferably at least 99% by weight, of the indicated biological macromolecules present (but water,
buffers, and other small molecules, especially molecules having a molecular weight of less than
1000 daltons, can be present).

The term "isolated" as used herein refers to a nucleic acid or polypeptide separated from
at least one other component (e.g., nucleic acid or polypeptide) present with the nucleic acid or
polypeptide in its natural source. In one embodiment, the nucleic acid or polypeptide is found in
the presence of (if anything) only a solvent, buffer, ion, or other component normally present in a
solution of the same. The terms "isolated" and "purified" do not encompass nucleic acids or
polypeptides present in their natural source.

The term "recombinant," when used herein to refer to a polypeptide or protein, means
that a polypeptide or protein is derived from recombinant (e.g., microbial, insect, or mammalian)
expression systems. "Microbial" refers to recombinant polypeptides or proteins made in
bacterial or fungal (e.g., yeast) expression systems. As a product, "recombinant microbial"
defines a polypeptide or protein essentially free of native endogenous substances and
unaccompanied by associated native glycosylation. Polypeptides or proteins expressed in most
bacterial cultures, e.g., E. coli, will be free of glycosylation modifications; polypeptides or
proteins expressed in yeast will have a glycosylation pattern generally different from that of
polypeptides or proteins expressed in mammalian cells.

The term "recombinant expression vehicle or vector" refers to a plasmid or phage or virus
or vector, for expressing a polypeptide from a DNA (RNA) sequence. An expression vehicle can
comprise a transcriptional unit comprising an assembly of (1) a genetic element or elements
having a regulatory role in gene expression, for example, promoters or enhancers, (2) a structural
or coding sequence which is transcribed into mRNA and translated into protein, and (3)
appropriate transcription initiation and termination sequences. Structural units intended for use
in yeast or eukaryotic expression systems preferably include a leader sequence enabling
extracellular secretion of translated protein by a host cell. Alternatively, where recombinant
protein is expressed without a leader or transport sequence, it may include an amino terminal
methionine residue. This residue may or may not be subsequently cleaved from the expressed
recombinant protein to provide a final product.

The term "recombinant expression system" means host cells which have stably integrated
a recombinant transcriptional unit into chromosomal DNA or carry the recombinant
transcriptional unit extrachromosomally. Recombinant expression systems as defined herein will
express heterologous polypeptides or proteins upon induction of the regulatory elements linked
to the DNA segment or synthetic gene to be expressed. This term also means host cells which
have stably integrated a recombinant genetic element or elements having a regulatory role in
gene expression, for example, promoters or enhancers. Recombinant expression systems as
defined herein will express polypeptides or proteins endogenous to the cell upon induction of the
regulatory elements linked to the endogenous DNA segment or gene to be expressed. The cells
can be prokaryotic or eukaryotic.

The term "secreted" includes a protein that is transported across or through a membrane,
including transport as a result of signal sequences in its amino acid sequence when it is
expressed in a suitable host cell. "Secreted" proteins include without limitation proteins secreted
wholly (e.g., soluble proteins) or partially (e.g., receptors) from the cell in which they are
expressed. "Secreted" proteins also include without limitation proteins that are transported
across the membrane of the endoplasmic reticulum. "Secreted" proteins are also intended to
include proteins containing non-typical signal sequences (e.g., Interleukin-1 Beta, see Krasney,
P.A. and Young, P.R. (1992) Cytokine 4(2):134-143) and factors released from damaged cells
(e.g., Interleukin-1 Receptor Antagonist, see Arend, W.P. et al. (1998) Annu. Rev. Immunol.
16:27-55).

Where desired, an expression vector may be designed to contain a "signal or leader 
sequence" which will direct the polypeptide through the membrane of a cell. Such a sequence 
may be naturally present on the polypeptides of the present invention or provided from 
heterologous protein sources by recombinant DNA techniques. 

The term "stringent" is used to refer to conditions that are commonly understood in the
art as stringent. Stringent conditions can include highly stringent conditions (i.e., hybridization
to filter-bound DNA in 0.5 M NaHPO4, 7% sodium dodecyl sulfate (SDS), 1 mM EDTA at
65°C, and washing in 0.1X SSC/0.1% SDS at 68°C), and moderately stringent conditions (i.e.,
washing in 0.2X SSC/0.1% SDS at 42°C). Other exemplary hybridization conditions are
described herein in the examples.

In instances of hybridization of deoxyoligonucleotides, additional exemplary stringent
hybridization conditions include washing in 6X SSC/0.05% sodium pyrophosphate at 37°C (for
14-base oligonucleotides), 48°C (for 17-base oligonucleotides), 55°C (for 20-base
oligonucleotides), and 60°C (for 23-base oligonucleotides).
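The oligonucleotide wash temperatures listed above amount to a small lookup table. The sketch below simply records the four quoted length/temperature pairs; the names are illustrative, and no interpolation to unlisted lengths is attempted because the text gives none.

```python
# Wash temperatures (degrees C) quoted for 6X SSC / 0.05% sodium pyrophosphate,
# keyed by oligonucleotide length in bases.
OLIGO_WASH_TEMP_C = {14: 37, 17: 48, 20: 55, 23: 60}


def wash_temperature_c(oligo_length):
    """Return the quoted wash temperature, or None if the length is not listed."""
    return OLIGO_WASH_TEMP_C.get(oligo_length)


for length in (14, 17, 20, 23):
    print(length, wash_temperature_c(length))
```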
As used herein, "substantially equivalent" can refer both to nucleotide and amino acid
sequences, for example a mutant sequence, that varies from a reference sequence by one or more
substitutions, deletions, or additions, the net effect of which does not result in an adverse
functional dissimilarity between the reference and subject sequences. Typically, such a
substantially equivalent sequence varies from one of those listed herein by no more than about
35% (i.e., the number of individual residue substitutions, additions, and/or deletions in a
substantially equivalent sequence, as compared to the corresponding reference sequence, divided
by the total number of residues in the substantially equivalent sequence is about 0.35 or less).
Such a sequence is said to have 65% sequence identity to the listed sequence. In one
embodiment, a substantially equivalent, e.g., mutant, sequence of the invention varies from a
listed sequence by no more than 30% (70% sequence identity); in a variation of this
embodiment, by no more than 25% (75% sequence identity); in a further variation of this
embodiment, by no more than 20% (80% sequence identity); in a further variation of this
embodiment, by no more than 10% (90% sequence identity); and in a further variation of this
embodiment, by no more than 5% (95% sequence identity). Substantially equivalent, e.g.,
mutant, amino acid sequences according to the invention preferably have at least 80% sequence
identity with a listed amino acid sequence, more preferably at least 85% sequence identity, more
preferably at least 90% sequence identity, more preferably at least 95% identity, more preferably
at least 98% identity, and most preferably at least 99% identity. Substantially equivalent
nucleotide sequences of the invention can have lower percent sequence identities, taking into
account, for example, the redundancy or degeneracy of the genetic code. Preferably, the nucleotide
sequence has at least about 65% identity, more preferably at least about 75% identity, more
preferably at least about 80% sequence identity, more preferably at least about 85% sequence
identity, more preferably at least about 90% sequence identity, more preferably at least about
95% identity, more preferably at least about 98% sequence identity, and most preferably at least
about 99% sequence identity. For the purposes of the present invention, sequences having
substantially equivalent biological activity and substantially equivalent expression characteristics
are considered substantially equivalent. For the purposes of determining equivalence, truncation
of the mature sequence (e.g., via a mutation which creates a spurious stop codon) should be
disregarded. Sequence identity may be determined, e.g., using the Jotun Hein method (Hein, J.
(1990) Methods Enzymol. 183:626-645). Identity between sequences can also be determined by
other methods known in the art, e.g., by varying hybridization conditions.
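The ratio in the definition above (differences divided by the number of residues in the substantially equivalent sequence) can be sketched as follows. This is an illustration under stated assumptions, not the Jotun Hein method: the two sequences are assumed to be already aligned, with "-" marking gaps, and the function names are hypothetical.

```python
def fractional_difference(reference_aligned, subject_aligned, gap="-"):
    """Differences (substitutions, additions, deletions) per subject residue.

    Both inputs must come from the same pairwise alignment, so they have
    equal length; the alignment itself is not computed here.
    """
    if len(reference_aligned) != len(subject_aligned):
        raise ValueError("aligned sequences must have equal length")
    differences = 0
    subject_residues = 0
    for ref, sub in zip(reference_aligned, subject_aligned):
        if ref == gap and sub == gap:
            continue  # column holds no residue in either sequence
        if sub != gap:
            subject_residues += 1
        if ref != sub:
            differences += 1  # substitution, addition, or deletion
    if subject_residues == 0:
        raise ValueError("subject sequence has no residues")
    return differences / subject_residues


def is_substantially_equivalent(reference_aligned, subject_aligned, max_difference=0.35):
    """Apply the 'no more than about 35%' threshold from the definition."""
    return fractional_difference(reference_aligned, subject_aligned) <= max_difference


# One substitution in ten residues: 10% difference, i.e. 90% identity.
print(fractional_difference("ACGTACGTAC", "ACGTTCGTAC"))
```

Passing a smaller `max_difference` (0.30, 0.25, 0.20, 0.10 or 0.05) corresponds to the narrower embodiments recited above.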

The term "totipotent" refers to the capability of a cell to differentiate into all of the cell
types of an adult organism.

The term "transformation" means introducing DNA into a suitable host cell so that the
DNA is replicable, either as an extrachromosomal element, or by chromosomal integration. The
term "transfection" refers to the taking up of an expression vector by a suitable host cell, whether
or not any coding sequences are in fact expressed. The term "infection" refers to the introduction
of nucleic acids into a suitable host cell by use of a virus or viral vector.

As used herein, an "uptake modulating fragment," UMF, means a series of nucleotides
which mediate the uptake of a linked DNA fragment into a cell. UMFs can be readily identified
using known UMFs as a target sequence or target motif with the computer-based systems
described below. The presence and activity of a UMF can be confirmed by attaching the
suspected UMF to a marker sequence. The resulting nucleic acid molecule is then incubated
with an appropriate host under appropriate conditions and the uptake of the marker sequence is
determined. As described above, a UMF will increase the frequency of uptake of a linked
marker sequence.

Each of the above terms is meant to encompass all that is described for each, unless the 
context dictates otherwise. 


4.2 NUCLEIC ACIDS OF THE INVENTION 

Nucleotide sequences of the invention are set forth in the Sequence Listing.

The isolated polynucleotides of the invention include a polynucleotide comprising the
nucleotide sequences of SEQ ID NO: 1-1350; a polynucleotide encoding any one of the peptide
sequences of SEQ ID NO: 1351-2700; and a polynucleotide comprising the nucleotide sequence
encoding the mature protein coding sequence of the polypeptides of any one of SEQ ID
NO: 1351-2700. The polynucleotides of the present invention also include, but are not limited to,
a polynucleotide that hybridizes under stringent conditions to (a) the complement of any of the
nucleotide sequences of SEQ ID NO: 1-1350; (b) nucleotide sequences encoding any one of the
amino acid sequences set forth in the Sequence Listing; (c) a polynucleotide which is an allelic
variant of any polynucleotide recited above; (d) a polynucleotide which encodes a species
homolog of any of the proteins recited above; or (e) a polynucleotide that encodes a polypeptide
comprising a specific domain or truncation of the polypeptides of SEQ ID NO: 1351-2700.
Domains of interest may depend on the nature of the encoded polypeptide; e.g., domains in
receptor-like polypeptides include ligand-binding, extracellular, transmembrane, or cytoplasmic
domains, or combinations thereof; domains in immunoglobulin-like proteins include the variable
immunoglobulin-like domains; domains in enzyme-like polypeptides include catalytic and
substrate binding domains; and domains in ligand polypeptides include receptor-binding
domains.

The polynucleotides of the invention include naturally occurring or wholly or partially
synthetic DNA, e.g., cDNA and genomic DNA, and RNA, e.g., mRNA. The polynucleotides
may include all of the coding region of the cDNA or may represent a portion of the coding
region of the cDNA.

The present invention also provides genes corresponding to the cDNA sequences disclosed
herein. The corresponding genes can be isolated in accordance with known methods using the
sequence information disclosed herein. Such methods include the preparation of probes or primers
from the disclosed sequence information for identification and/or amplification of genes in
appropriate genomic libraries or other sources of genomic materials. Further 5' and 3' sequence can
be obtained using methods known in the art. For example, full length cDNA or genomic DNA that
corresponds to any of the polynucleotides of SEQ ID NO: 1-1350 can be obtained by screening
appropriate cDNA or genomic DNA libraries under suitable hybridization conditions using any of
the polynucleotides of SEQ ID NO: 1-1350 or a portion thereof as a probe. Alternatively, the
polynucleotides of SEQ ID NO: 1-1350 may be used as the basis for suitable primer(s) that allow
identification and/or amplification of genes in appropriate genomic DNA or cDNA libraries.

The nucleic acid sequences of the invention can be assembled from ESTs and sequences
(including cDNA and genomic sequences) obtained from one or more public databases, such as
dbEST, gbpri, and UniGene. The EST sequences can provide identifying sequence information,
representative fragment or segment information, or novel segment information for the full-length
gene.

The polynucleotides of the invention also include polynucleotides comprising nucleotide
sequences that are substantially equivalent to the polynucleotides recited above. Polynucleotides
according to the invention can have, e.g., at least about 65%, at least about 70%, at least about
75%, at least about 80%, 81%, 82%, 83%, 84%, more typically at least about 85%, 86%, 87%,
88%, 89%, more typically at least about 90%, 91%, 92%, 93%, 94%, and even more typically at
least about 95%, 96%, 97%, 98%, 99% sequence identity to a polynucleotide recited above.

Included within the scope of the nucleic acid sequences of the invention are nucleic acid
sequence fragments that hybridize under stringent conditions to any of the nucleotide sequences
of SEQ ID NO: 1-1350, or complements thereof, which fragment is greater than about 5
nucleotides, preferably 7 nucleotides, more preferably greater than 9 nucleotides and most
preferably greater than 17 nucleotides. Fragments of, e.g., 15, 17, or 20 nucleotides or more that
are selective for (i.e., specifically hybridize to) any one of the polynucleotides of the invention
are contemplated. Probes capable of specifically hybridizing to a polynucleotide can
differentiate polynucleotide sequences of the invention from other polynucleotide sequences in
the same family of genes or can differentiate human genes from genes of other species, and are
preferably based on unique nucleotide sequences.

The sequences falling within the scope of the present invention are not limited to these
specific sequences, but also include allelic and species variations thereof. Allelic and species
variations can be routinely determined by comparing the sequence provided in SEQ ID NO: 1-1350, a
representative fragment thereof, or a nucleotide sequence at least 90% identical, preferably 95%
identical, to SEQ ID NO: 1-1350 with a sequence from another isolate of the same species.
Furthermore, to accommodate codon variability, the invention includes nucleic acid molecules
coding for the same amino acid sequences as do the specific ORFs disclosed herein. In other words,
in the coding region of an ORF, substitution of one codon for another codon that encodes the same
amino acid is expressly contemplated.
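The codon-variability point above can be checked mechanically: two ORFs that differ only by synonymous codons translate to the same amino acid sequence. The sketch below assumes the standard genetic code; the function names and the example ORFs are hypothetical and are not taken from the Sequence Listing.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ('*' marks stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf):
    """Translate an in-frame ORF (DNA or RNA letters) into one-letter amino acids."""
    orf = orf.upper().replace("U", "T")
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


def encodes_same_protein(orf_a, orf_b):
    """True if the two ORFs differ only by synonymous codon substitutions."""
    return translate(orf_a) == translate(orf_b)


# CTG -> TTA (Leu), AAA -> AAG (Lys), TAA -> TGA (stop): all synonymous.
print(encodes_same_protein("ATGCTGAAATAA", "ATGTTAAAGTGA"))
```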

The nearest neighbor or homology result for the nucleic acids of the present invention,
including SEQ ID NO: 1-1350, can be obtained by searching a database using an algorithm or a
program. Preferably, BLAST (Basic Local Alignment Search Tool) is used to
search for local sequence alignments (Altschul, S.F. J. Mol. Evol. 36:290-300 (1993) and Altschul,
S.F. et al. J. Mol. Biol. 215:403-410 (1990)). Alternatively, a FASTA version 3 search against
Genpept, using the Fastxy algorithm, can be used.

Species homologs (or orthologs) of the disclosed polynucleotides and proteins are also 
provided by the present invention. Species homologs may be isolated and identified by making 
suitable probes or primers from the sequences provided herein and screening a suitable nucleic 
acid source from the desired species. 

The invention also encompasses allelic variants of the disclosed polynucleotides or
proteins; that is, naturally-occurring alternative forms of the isolated polynucleotide which also
encode proteins which are identical, homologous or related to that encoded by the
polynucleotides.

The nucleic acid sequences of the invention are further directed to sequences which
encode variants of the described nucleic acids. These amino acid sequence variants may be
prepared by methods known in the art by introducing appropriate nucleotide changes into a
native or variant polynucleotide. There are two variables in the construction of amino acid
sequence variants: the location of the mutation and the nature of the mutation. Nucleic acids
encoding the amino acid sequence variants are preferably constructed by mutating the
polynucleotide to encode an amino acid sequence that does not occur in nature. These nucleic
acid alterations can be made at sites that differ in the nucleic acids from different species
(variable positions) or in highly conserved regions (constant regions). Sites at such locations
will typically be modified in series, e.g., by substituting first with conservative choices (e.g.,
hydrophobic amino acid to a different hydrophobic amino acid) and then with more distant
choices (e.g., hydrophobic amino acid to a charged amino acid), and then deletions or insertions
may be made at the target site. Amino acid sequence deletions generally range from about 1 to
30 residues, preferably about 1 to 10 residues, and are typically contiguous. Amino acid
insertions include amino- and/or carboxyl-terminal fusions ranging in length from one to one
hundred or more residues, as well as intrasequence insertions of single or multiple amino acid
residues. Intrasequence insertions may range generally from about 1 to 10 amino residues,
preferably from 1 to 5 residues. Examples of terminal insertions include the heterologous signal
sequences necessary for secretion or for intracellular targeting in different host cells and
sequences such as FLAG or poly-histidine sequences useful for purifying the expressed protein.

In a preferred method, polynucleotides encoding the novel amino acid sequences are
changed via site-directed mutagenesis. This method uses oligonucleotide sequences to alter a
polynucleotide to encode the desired amino acid variant, as well as sufficient adjacent
nucleotides on both sides of the changed amino acid to form a stable duplex on either side of the
site being changed. In general, the techniques of site-directed mutagenesis are well known to
those of skill in the art and this technique is exemplified by publications such as Edelman et al.,
DNA 2:183 (1983). A versatile and efficient method for producing site-specific changes in a
polynucleotide sequence was published by Zoller and Smith, Nucleic Acids Res. 10:6487-6500
(1982). PCR may also be used to create amino acid sequence variants of the novel nucleic acids.
When small amounts of template DNA are used as starting material, primer(s) that differ
slightly in sequence from the corresponding region in the template DNA can generate the desired
amino acid variant. PCR amplification results in a population of product DNA fragments that
differ from the polynucleotide template encoding the polypeptide at the position specified by the
primer. The product DNA fragments replace the corresponding region in the plasmid and this
gives a polynucleotide encoding the desired amino acid variant.
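The oligonucleotide described above, a changed codon flanked by enough unchanged template sequence on both sides to form a stable duplex, can be sketched as a simple string operation. This is an illustration only: the function name, the default flank of 12 nucleotides and the example template are assumptions, and no melting-temperature or duplex-stability check is performed.

```python
def mutagenic_oligo(template, codon_start, new_codon, flank=12):
    """Return the replacement codon flanked by unchanged template sequence.

    template: coding-strand sequence; codon_start: 0-based index of the first
    base of the codon to change; flank: number of unchanged nucleotides kept
    on each side (an illustrative default, not a value from the text).
    """
    if len(new_codon) != 3:
        raise ValueError("new_codon must be exactly three nucleotides")
    if codon_start < flank or codon_start + 3 + flank > len(template):
        raise ValueError("not enough flanking sequence around the codon")
    left = template[codon_start - flank:codon_start]
    right = template[codon_start + 3:codon_start + 3 + flank]
    return left + new_codon + right


# Hypothetical template: change the TTT (Phe) codon at index 9 to GCT (Ala).
print(mutagenic_oligo("AAACCCGGGTTTAAACCC", 9, "GCT", flank=3))
```

The same string would serve as the mismatched primer in the PCR variant of the approach, again only as a sketch.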

A further technique for generating amino acid variants is the cassette mutagenesis
technique described in Wells et al., Gene 34:315 (1985); and other mutagenesis techniques well
known in the art, such as, for example, the techniques in Sambrook et al., supra, and Current
Protocols in Molecular Biology, Ausubel et al. Due to the inherent degeneracy of the genetic
code, other DNA sequences which encode substantially the same or a functionally equivalent
amino acid sequence may be used in the practice of the invention for the cloning and expression
of these novel nucleic acids. Such DNA sequences include those which are capable of
hybridizing to the appropriate novel nucleic acid sequence under stringent conditions.

Polynucleotides encoding preferred polypeptide truncations of the invention can be used
to generate polynucleotides encoding chimeric or fusion proteins comprising one or more
domains of the invention and heterologous protein sequences.

The polynucleotides of the invention additionally include the complement of any of the
polynucleotides recited above. The polynucleotide can be DNA (genomic, cDNA, amplified, or
synthetic) or RNA. Methods and algorithms for obtaining such polynucleotides are well known
to those of skill in the art and can include, for example, methods for determining hybridization
conditions that can routinely isolate polynucleotides of the desired sequence identities.

In accordance with the invention, polynucleotide sequences comprising the mature
protein coding sequences corresponding to any one of SEQ ID NO: 1-1350, or functional
equivalents thereof, may be used to generate recombinant DNA molecules that direct the
expression of that nucleic acid, or a functional equivalent thereof, in appropriate host cells. Also
included are the cDNA inserts of any of the clones identified herein.

A polynucleotide according to the invention can be joined to any of a variety of other
nucleotide sequences by well-established recombinant DNA techniques (see Sambrook J et al.
(1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, NY). Useful
nucleotide sequences for joining to polynucleotides include an assortment of vectors, e.g.,
plasmids, cosmids, lambda phage derivatives, phagemids, and the like, that are well known in the
art. Accordingly, the invention also provides a vector including a polynucleotide of the
invention and a host cell containing the polynucleotide. In general, the vector contains an origin
of replication functional in at least one organism, convenient restriction endonuclease sites, and a
selectable marker for the host cell. Vectors according to the invention include expression
vectors, replication vectors, probe generation vectors, and sequencing vectors. A host cell
according to the invention can be a prokaryotic or eukaryotic cell and can be a unicellular
organism or part of a multicellular organism.

The present invention further provides recombinant constructs comprising a nucleic acid
having any of the nucleotide sequences of SEQ ID NO: 1-1350 or a fragment thereof or any other
polynucleotides of the invention. In one embodiment, the recombinant constructs of the present
invention comprise a vector, such as a plasmid or viral vector, into which a nucleic acid having
any of the nucleotide sequences of SEQ ID NO: 1-1350 or a fragment thereof is inserted, in a
forward or reverse orientation. In the case of a vector comprising one of the ORFs of the present
invention, the vector may further comprise regulatory sequences, including for example, a
promoter, operably linked to the ORF. Large numbers of suitable vectors and promoters are
known to those of skill in the art and are commercially available for generating the recombinant
constructs of the present invention. The following vectors are provided by way of example.
Bacterial: pBs, phagescript, PsiX174, pBluescript SK, pBs KS, pNH8a, pNH16a, pNH18a,
pNH46a (Stratagene); pTrc99A, pKK223-3, pKK233-3, pDR540, pRIT5 (Pharmacia).
Eukaryotic: pWLneo, pSV2cat, pOG44, PXTI, pSG (Stratagene); pSVK3, pBPV, pMSG, pSVL
(Pharmacia).

The isolated polynucleotide of the invention may be operably linked to an expression
control sequence such as the pMT2 or pED expression vectors disclosed in Kaufman et al.,
Nucleic Acids Res. 19, 4485-4490 (1991), in order to produce the protein recombinantly. Many
suitable expression control sequences are known in the art. General methods of expressing
recombinant proteins are also known and are exemplified in R. Kaufman, Methods in
Enzymology 185, 537-566 (1990). As defined herein, "operably linked" means that the isolated
polynucleotide of the invention and an expression control sequence are situated within a vector
or cell in such a way that the protein is expressed by a host cell which has been transformed
(transfected) with the ligated polynucleotide/expression control sequence.

Promoter regions can be selected from any desired gene using CAT (chloramphenicol
acetyltransferase) vectors or other vectors with selectable markers. Two appropriate vectors are
pKK232-8 and pCM7. Particular named bacterial promoters include lacI, lacZ, T3, T7, gpt,
lambda PR, and trc. Eukaryotic promoters include CMV immediate early, HSV thymidine
kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of
the appropriate vector and promoter is well within the level of ordinary skill in the art.
Generally, recombinant expression vectors will include origins of replication and selectable
markers permitting transformation of the host cell, e.g., the ampicillin resistance gene of E. coli
and S. cerevisiae TRP1 gene, and a promoter derived from a highly-expressed gene to direct
transcription of a downstream structural sequence. Such promoters can be derived from operons
encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), α-factor, acid
phosphatase, or heat shock proteins, among others. The heterologous structural sequence is
assembled in appropriate phase with translation initiation and termination sequences, and
preferably, a leader sequence capable of directing secretion of translated protein into the
periplasmic space or extracellular medium. Optionally, the heterologous sequence can encode a
fusion protein including an amino terminal identification peptide imparting desired
characteristics, e.g., stabilization or simplified purification of expressed recombinant product.

Useful expression vectors for bacterial use are constructed by inserting a structural DNA
sequence encoding a desired protein together with suitable translation initiation and termination
signals in operable reading phase with a functional promoter. The vector will comprise one or
more phenotypic selectable markers and an origin of replication to ensure maintenance of the
vector and to, if desirable, provide amplification within the host. Suitable prokaryotic hosts for
transformation include E. coli, Bacillus subtilis, Salmonella typhimurium and various species
within the genera Pseudomonas, Streptomyces, and Staphylococcus, although others may also be
employed as a matter of choice.

As a representative but non-limiting example, useful expression vectors for bacterial use
can comprise a selectable marker and bacterial origin of replication derived from commercially
available plasmids comprising genetic elements of the well known cloning vector pBR322
(ATCC 37017). Such commercial vectors include, for example, pKK223-3 (Pharmacia Fine
Chemicals, Uppsala, Sweden) and GEM1 (Promega Biotech, Madison, WI, USA). These
pBR322 "backbone" sections are combined with an appropriate promoter and the structural
sequence to be expressed. Following transformation of a suitable host strain and growth of the
host strain to an appropriate cell density, the selected promoter is induced or derepressed by
appropriate means (e.g., temperature shift or chemical induction) and cells are cultured for an
additional period. Cells are typically harvested by centrifugation, disrupted by physical or
chemical means, and the resulting crude extract retained for further purification.

Polynucleotides of the invention can also be used to induce immune responses. For
example, as described in Fan et al., Nat. Biotech. 17:870-872 (1999), incorporated herein by
reference, nucleic acid sequences encoding a polypeptide may be used to generate antibodies
against the encoded polypeptide following topical administration of naked plasmid DNA or
following injection, and preferably intramuscular injection of the DNA. The nucleic acid
sequences are preferably inserted in a recombinant expression vector and may be in the form of
naked DNA.

4.3 ANTISENSE

Another aspect of the invention pertains to isolated antisense nucleic acid molecules that
are hybridizable to or complementary to the nucleic acid molecule comprising the nucleotide
sequence of SEQ ID NO: 1-1350, or fragments, analogs or derivatives thereof. An "antisense"
nucleic acid comprises a nucleotide sequence that is complementary to a "sense" nucleic acid
encoding a protein, e.g., complementary to the coding strand of a double-stranded cDNA
molecule or complementary to an mRNA sequence. In specific aspects, antisense nucleic acid
molecules are provided that comprise a sequence complementary to at least about 10, 25, 50,
100, 250 or 500 nucleotides or an entire coding strand, or to only a portion thereof. Nucleic acid
molecules encoding fragments, homologs, derivatives and analogs of a protein of any of SEQ ID
NO: 1351-2700 or antisense nucleic acids complementary to a nucleic acid sequence of SEQ ID
NO: 1-1350 are additionally provided.

In one embodiment, an antisense nucleic acid molecule is antisense to a "coding region"
of the coding strand of a nucleotide sequence of the invention. The term "coding region" refers
to the region of the nucleotide sequence comprising codons which are translated into amino acid
residues. In another embodiment, the antisense nucleic acid molecule is antisense to a
"noncoding region" of the coding strand of a nucleotide sequence of the invention. The term
"noncoding region" refers to 5' and 3' sequences which flank the coding region that are not
translated into amino acids (i.e., also referred to as 5' and 3' untranslated regions).

Given the coding strand sequences encoding a nucleic acid disclosed herein (e.g., SEQ
ID NO: 1-1350), antisense nucleic acids of the invention can be designed according to the rules
of Watson and Crick or Hoogsteen base pairing. The antisense nucleic acid molecule can be
complementary to the entire coding region of a mRNA, but more preferably is an oligonucleotide
that is antisense to only a portion of the coding or noncoding region of a mRNA. For example,
the antisense oligonucleotide can be complementary to the region surrounding the translation
start site of a mRNA. An antisense oligonucleotide can be, for example, about 5, 10, 15, 20, 25,
30, 35, 40, 45 or 50 nucleotides in length. An antisense nucleic acid of the invention can be
constructed using chemical synthesis or enzymatic ligation reactions using procedures known in
the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be
chemically synthesized using naturally occurring nucleotides or variously modified nucleotides
designed to increase the biological stability of the molecules or to increase the physical stability
of the duplex formed between the antisense and sense nucleic acids, e.g., phosphorothioate
derivatives and acridine substituted nucleotides can be used.
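Designing an antisense oligonucleotide by Watson-Crick pairing, as described above, reduces to taking the reverse complement of the targeted stretch of the coding strand. The following is a minimal sketch under stated assumptions: the function names, the flank size and the example sequence are hypothetical, only unmodified DNA bases are produced, and Hoogsteen pairing is not modeled.

```python
# Watson-Crick complements; U in an RNA input is paired with A.
_COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def reverse_complement(seq):
    """Return the DNA reverse complement of a DNA or RNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def antisense_oligo(coding_strand, start_codon_index, flank=10):
    """Antisense DNA oligo spanning the start codon plus `flank` bases per side.

    start_codon_index is the 0-based position of the A of the ATG; the window
    is clipped at the ends of the sequence.
    """
    begin = max(0, start_codon_index - flank)
    end = min(len(coding_strand), start_codon_index + 3 + flank)
    return reverse_complement(coding_strand[begin:end])


# Hypothetical coding strand with the ATG at index 6.
print(antisense_oligo("GGCACCATGGCTAGC", 6, flank=3))
```

Choosing `flank` so that the result is roughly 5 to 50 nucleotides long keeps the oligonucleotide within the lengths mentioned in the text.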

Examples of modified nucleotides that can be used to generate the antisense nucleic acid 
include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine, 
4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-2-thiouridine, 
5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine, 
inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 
2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine, 
7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil, 
beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil, 5-methoxyuracil, 
2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil, 
queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil, 
uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil, 
3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine. Alternatively, the 
antisense nucleic acid can be produced biologically using an expression vector into which a 
nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the 
inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest, 
described further in the following subsection). 
The antisense nucleic acid molecules of the invention are typically administered to a 
subject or generated in situ such that they hybridize with or bind to cellular mRNA and/or 
genomic DNA encoding a protein according to the invention to thereby inhibit expression of the 
protein, e.g., by inhibiting transcription and/or translation. The hybridization can be by 
conventional nucleotide complementarity to form a stable duplex, or, for example, in the case of 
an antisense nucleic acid molecule that binds to DNA duplexes, through specific interactions in 
the major groove of the double helix. An example of a route of administration of antisense 
nucleic acid molecules of the invention includes direct injection at a tissue site. Alternatively, 
antisense nucleic acid molecules can be modified to target selected cells and then administered 
systemically. For example, for systemic administration, antisense molecules can be modified 
such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g., 
by linking the antisense nucleic acid molecules to peptides or antibodies that bind to cell surface 
receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using 
the vectors described herein. To achieve sufficient intracellular concentrations of antisense 
molecules, vector constructs in which the antisense nucleic acid molecule is placed under the 
control of a strong pol II or pol III promoter are preferred. 

In yet another embodiment, the antisense nucleic acid molecule of the invention is an 
α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific 
double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the 
strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res 15: 6625-6641). The 
antisense nucleic acid molecule can also comprise a 2'-o-methylribonucleotide (Inoue et al. 
(1987) Nucleic Acids Res 15: 6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987) 
FEBS Lett 215: 327-330). 

4.4 RIBOZYMES AND PNA MOIETIES 

In still another embodiment, an antisense nucleic acid of the invention is a ribozyme. 
Ribozymes are catalytic RNA molecules with ribonuclease activity that are capable of cleaving a 
single-stranded nucleic acid, such as a mRNA, to which they have a complementary region. 
Thus, ribozymes (e.g., hammerhead ribozymes (described in Haselhoff and Gerlach (1988) 
Nature 334:585-591)) can be used to catalytically cleave mRNA transcripts to thereby inhibit 
translation of a mRNA. A ribozyme having specificity for a nucleic acid of the invention can be 
designed based upon the nucleotide sequence of a DNA disclosed herein (i.e., SEQ ID NO: 1-1350). 
For example, a derivative of a Tetrahymena L-19 IVS RNA can be constructed in which 
the nucleotide sequence of the active site is complementary to the nucleotide sequence to be 
cleaved in a SECX-encoding mRNA. See, e.g., Cech et al. U.S. Pat. No. 4,987,071; and Cech et 
al. U.S. Pat. No. 5,116,742. Alternatively, SECX mRNA can be used to select a catalytic RNA 
having a specific ribonuclease activity from a pool of RNA molecules. See, e.g., Bartel et al., 
(1993) Science 261:1411-1418. 

Alternatively, gene expression can be inhibited by targeting nucleotide sequences 
complementary to the regulatory region (e.g., promoter and/or enhancers) to form triple helical 
structures that prevent transcription of the gene in target cells. See generally, Helene (1991) 
Anticancer Drug Des. 6: 569-84; Helene et al. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and 
Maher (1992) Bioassays 14: 807-15. 

In various embodiments, the nucleic acids of the invention can be modified at the base 
moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or 
solubility of the molecule. For example, the deoxyribose phosphate backbone of the nucleic 
acids can be modified to generate peptide nucleic acids (see Hyrup et al. (1996) Bioorg Med 
Chem 4: 5-23). As used herein, the terms "peptide nucleic acids" or "PNAs" refer to nucleic acid 
mimics, e.g., DNA mimics, in which the deoxyribose phosphate backbone is replaced by a 
pseudopeptide backbone and only the four natural nucleobases are retained. The neutral 
backbone of PNAs has been shown to allow for specific hybridization to DNA and RNA under 
conditions of low ionic strength. The synthesis of PNA oligomers can be performed using 
standard solid phase peptide synthesis protocols as described in Hyrup et al. (1996) above; 
Perry-O'Keefe et al. (1996) PNAS 93: 14670-675. 

PNAs of the invention can be used in therapeutic and diagnostic applications. For 
example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of 
gene expression by, e.g., inducing transcription or translation arrest or inhibiting replication. 
PNAs of the invention can also be used, e.g., in the analysis of single base pair mutations in a 
gene by, e.g., PNA directed PCR clamping; as artificial restriction enzymes when used in 
combination with other enzymes, e.g., S1 nucleases (Hyrup B. (1996) above); or as probes or 
primers for DNA sequence and hybridization (Hyrup et al. (1996), above; Perry-O'Keefe (1996), 
above). 

In another embodiment, PNAs of the invention can be modified, e.g., to enhance their 
stability or cellular uptake, by attaching lipophilic or other helper groups to PNA, by the 
formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of drug 
delivery known in the art. For example, PNA-DNA chimeras can be generated that may 
combine the advantageous properties of PNA and DNA. Such chimeras allow DNA recognition 
enzymes, e.g., RNase H and DNA polymerases, to interact with the DNA portion while the PNA 
portion would provide high binding affinity and specificity. PNA-DNA chimeras can be linked 
using linkers of appropriate lengths selected in terms of base stacking, number of bonds between 
the nucleobases, and orientation (Hyrup (1996) above). The synthesis of PNA-DNA chimeras 
can be performed as described in Hyrup (1996) above and Finn et al. (1996) Nucl Acids Res 24: 
3357-63. For example, a DNA chain can be synthesized on a solid support using standard 
phosphoramidite coupling chemistry, and modified nucleoside analogs, e.g., 
5'-(4-methoxytrityl)amino-5'-deoxy-thymidine phosphoramidite, can be used between the PNA 
and the 5' end of DNA (Mag et al. (1989) Nucl Acid Res 17: 5973-88). PNA monomers are then 
coupled in a stepwise manner to produce a chimeric molecule with a 5' PNA segment and a 3' 
DNA segment (Finn et al. (1996) above). Alternatively, chimeric molecules can be synthesized 
with a 5' DNA segment and a 3' PNA segment. See, Petersen et al. (1975) Bioorg Med Chem 
Lett 5: 1119-11124. 

In other embodiments, the oligonucleotide may include other appended groups such as 
peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the 
cell membrane (see, e.g., Letsinger et al., 1989, Proc. Natl. Acad. Sci. U.S.A. 86:6553-6556; 
Lemaitre et al., 1987, Proc. Natl. Acad. Sci. 84:648-652; PCT Publication No. WO88/09810) or 
the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition, 
oligonucleotides can be modified with hybridization triggered cleavage agents (See, e.g., Krol et 
al., 1988, BioTechniques 6:958-976) or intercalating agents. (See, e.g., Zon, 1988, Pharm. Res. 
5: 539-549). To this end, the oligonucleotide may be conjugated to another molecule, e.g., a 
peptide, a hybridization triggered cross-linking agent, a transport agent, a hybridization-triggered 
cleavage agent, etc. 

4.5 HOSTS 

The present invention further provides host cells genetically engineered to contain the 
polynucleotides of the invention. For example, such host cells may contain nucleic acids of the 
invention introduced into the host cell using known transformation, transfection or infection 
methods. The present invention still further provides host cells genetically engineered to express 
the polynucleotides of the invention, wherein such polynucleotides are in operative association 
with a regulatory sequence heterologous to the host cell which drives expression of the 
polynucleotides in the cell. 

Knowledge of nucleic acid sequences allows for modification of cells to permit, or 
increase, expression of endogenous polypeptide. Cells can be modified (e.g., by homologous 


WO 01/57188 



PCT/US01/03800 



recombination) to provide increased polypeptide expression by replacing, in whole or in part, the 
naturally occurring promoter with all or part of a heterologous promoter so that the cells express 
the polypeptide at higher levels. The heterologous promoter is inserted in such a manner that it 
is operatively linked to the encoding sequences. See, for example, PCT International Publication 
No. WO94/12650, PCT International Publication No. WO92/20808, and PCT International 
Publication No. WO91/09955. It is also contemplated that, in addition to heterologous promoter 
DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which 
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or 
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the coding 
sequence, amplification of the marker DNA by standard selection methods results in 
co-amplification of the desired protein coding sequences in the cells. 

The host cell can be a higher eukaryotic host cell, such as a mammalian cell, a lower 
eukaryotic host cell, such as a yeast cell, or the host cell can be a prokaryotic cell, such as a 
bacterial cell. Introduction of the recombinant construct into the host cell can be effected by 
calcium phosphate transfection, DEAE-dextran mediated transfection, or electroporation (Davis, 
L. et al., Basic Methods in Molecular Biology (1986)). The host cells containing one of the 
polynucleotides of the invention can be used in conventional manners to produce the gene 
product encoded by the isolated fragment (in the case of an ORF) or can be used to produce a 
heterologous protein under the control of the EMF. 

Any host/vector system can be used to express one or more of the ORFs of the present 
invention. These include, but are not limited to, eukaryotic hosts such as HeLa cells, CV-1 cells, 
COS cells, 293 cells, and Sf9 cells, as well as prokaryotic hosts such as E. coli and B. subtilis. 
The most preferred cells are those which do not normally express the particular polypeptide or 
protein or which express the polypeptide or protein at a low natural level. Mature proteins can 
be expressed in mammalian cells, yeast, bacteria, or other cells under the control of appropriate 
promoters. Cell-free translation systems can also be employed to produce such proteins using 
RNAs derived from the DNA constructs of the present invention. Appropriate cloning and 
expression vectors for use with prokaryotic and eukaryotic hosts are described by Sambrook, et 
al., in Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, New 
York (1989), the disclosure of which is hereby incorporated by reference. 

Various mammalian cell culture systems can also be employed to express recombinant 
protein. Examples of mammalian expression systems include the COS-7 lines of monkey kidney 
fibroblasts, described by Gluzman, Cell 23:175 (1981). Other cell lines capable of expressing a 
compatible vector are, for example, the C127, monkey COS cells, Chinese Hamster Ovary 
(CHO) cells, human kidney 293 cells, human epidermal A431 cells, human Colo205 cells, 3T3 
cells, CV-1 cells, other transformed primate cell lines, normal diploid cells, cell strains derived 
from in vitro culture of primary tissue, primary explants, HeLa cells, mouse L cells, BHK, 
HL-60, U937, HaK or Jurkat cells. Mammalian expression vectors will comprise an origin of 
replication, a suitable promoter and also any necessary ribosome binding sites, polyadenylation 
site, splice donor and acceptor sites, transcriptional termination sequences, and 5' flanking 
nontranscribed sequences. DNA sequences derived from the SV40 viral genome, for example, 
SV40 origin, early promoter, enhancer, splice, and polyadenylation sites may be used to provide 
the required nontranscribed genetic elements. Recombinant polypeptides and proteins produced 
in bacterial culture are usually isolated by initial extraction from cell pellets, followed by one or 
more salting-out, aqueous ion exchange or size exclusion chromatography steps. Protein 
refolding steps can be used, as necessary, in completing configuration of the mature protein. 
Finally, high performance liquid chromatography (HPLC) can be employed for final purification 
steps. Microbial cells employed in expression of proteins can be disrupted by any convenient 
method, including freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing 
agents. 

Alternatively, it may be possible to produce the protein in lower eukaryotes such as yeast 
or insects or in prokaryotes such as bacteria. Potentially suitable yeast strains include 
Saccharomyces cerevisiae, Schizosaccharomyces pombe, Kluyveromyces strains, Candida, or 
any yeast strain capable of expressing heterologous proteins. Potentially suitable bacterial 
strains include Escherichia coli, Bacillus subtilis, Salmonella typhimurium, or any bacterial 
strain capable of expressing heterologous proteins. If the protein is made in yeast or bacteria, it 
may be necessary to modify the protein produced therein, for example by phosphorylation or 
glycosylation of the appropriate sites, in order to obtain the functional protein. Such covalent 
attachments may be accomplished using known chemical or enzymatic methods. 

In another embodiment of the present invention, cells and tissues may be engineered to 
express an endogenous gene comprising the polynucleotides of the invention under the control of 
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene 
may be replaced by homologous recombination. As described herein, gene targeting can be used 
to replace a gene's existing regulatory region with a regulatory sequence isolated from a different 
gene or a novel regulatory sequence synthesized by genetic engineering methods. Such 
regulatory sequences may be comprised of promoters, enhancers, scaffold-attachment regions, 
negative regulatory elements, transcriptional initiation sites, regulatory protein binding sites or 
combinations of said sequences. Alternatively, sequences which affect the structure or stability 
of the RNA or protein produced may be replaced, removed, added, or otherwise modified by 
targeting. These sequences include polyadenylation signals, mRNA stability elements, splice 
sites, leader sequences for enhancing or modifying transport or secretion properties of the 
protein, or other sequences which alter or improve the function or stability of protein or RNA 
molecules. 

The targeting event may be a simple insertion of the regulatory sequence, placing the 
gene under the control of the new regulatory sequence, e.g., inserting a new promoter or 
enhancer or both upstream of a gene. Alternatively, the targeting event may be a simple deletion 
of a regulatory element, such as the deletion of a tissue-specific negative regulatory element. 
Alternatively, the targeting event may replace an existing element; for example, a tissue-specific 
enhancer can be replaced by an enhancer that has broader or different cell-type specificity than 
the naturally occurring elements. Here, the naturally occurring sequences are deleted and new 
sequences are added. In all cases, the identification of the targeting event may be facilitated by 
the use of one or more selectable marker genes that are contiguous with the targeting DNA, 
allowing for the selection of cells in which the exogenous DNA has integrated into the host cell 
genome. The identification of the targeting event may also be facilitated by the use of one or 
more marker genes exhibiting the property of negative selection, such that the negatively 
selectable marker is linked to the exogenous DNA, but configured such that the negatively 
selectable marker flanks the targeting sequence, and such that a correct homologous 
recombination event with sequences in the host cell genome does not result in the stable 
integration of the negatively selectable marker. Markers useful for this purpose include the 
Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial xanthine-guanine 
phosphoribosyl-transferase (gpt) gene. 

The gene targeting or gene activation techniques which can be used in accordance with 
this aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to 
Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No. 
PCT/US92/09627 (WO93/09222) by Selden et al.; and International Application No. 
PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is incorporated by reference 
herein in its entirety. 

4.6 POLYPEPTIDES OF THE INVENTION 

The isolated polypeptides of the invention include, but are not limited to, a polypeptide 
comprising: the amino acid sequences set forth as any one of SEQ ID NO: 1351-2700 or an 
amino acid sequence encoded by any one of the nucleotide sequences SEQ ID NO: 1-1350 or the 
corresponding full length or mature protein. Polypeptides of the invention also include 
polypeptides preferably with biological or immunological activity that are encoded by: (a) a 
polynucleotide having any one of the nucleotide sequences set forth in SEQ ID NO: 1-1350 or (b) 
polynucleotides encoding any one of the amino acid sequences set forth as SEQ ID NO: 1351-2700 
or (c) polynucleotides that hybridize to the complement of the polynucleotides of either (a) 
or (b) under stringent hybridization conditions. The invention also provides biologically active 
or immunologically active variants of any of the amino acid sequences set forth as SEQ ID 
NO: 1351-2700 or the corresponding full length or mature protein; and "substantial equivalents" 
thereof (e.g., with at least about 65%, at least about 70%, at least about 75%, at least about 80%, 
at least about 85%, 86%, 87%, 88%, 89%, at least about 90%, 91%, 92%, 93%, 94%, typically at 
least about 95%, 96%, 97%, more typically at least about 98%, or most typically at least about 
99% amino acid identity) that retain biological activity. Polypeptides encoded by allelic variants 
may have a similar, increased, or decreased activity compared to polypeptides comprising SEQ 
ID NO: 1351-2700. 

Fragments of the proteins of the present invention which are capable of exhibiting 
biological activity are also encompassed by the present invention. Fragments of the protein may 
be in linear form or they may be cyclized using known methods, for example, as described in H. 
U. Saragovi, et al., Bio/Technology 10, 773-778 (1992) and in R. S. McDowell, et al., J. Amer. 
Chem. Soc. 114, 9245-9253 (1992), both of which are incorporated herein by reference. Such 
fragments may be fused to carrier molecules such as immunoglobulins for many purposes, 
including increasing the valency of protein binding sites. 

The present invention also provides both full-length and mature forms (for example, 
without a signal sequence or precursor sequence) of the disclosed proteins. The protein coding 
sequence is identified in the sequence listing by translation of the disclosed nucleotide 
sequences. The mature form of such protein may be obtained by expression of a full-length 
polynucleotide in a suitable mammalian cell or other host cell. The sequence of the mature form 
of the protein is also determinable from the amino acid sequence of the full-length form. Where 
proteins of the present invention are membrane bound, soluble forms of the proteins are also 
provided. In such forms, part or all of the regions causing the proteins to be membrane bound 
are deleted so that the proteins are fully secreted from the cell in which they are expressed. 

Protein compositions of the present invention may further comprise an acceptable carrier, 
such as a hydrophilic, e.g., pharmaceutically acceptable, carrier. 

The present invention further provides isolated polypeptides encoded by the nucleic acid 
fragments of the present invention or by degenerate variants of the nucleic acid fragments of the 
present invention. By "degenerate variant" is intended nucleotide fragments which differ from a 
nucleic acid fragment of the present invention (e.g., an ORF) by nucleotide sequence but, due to 
the degeneracy of the genetic code, encode an identical polypeptide sequence. Preferred nucleic 
acid fragments of the present invention are the ORFs that encode proteins. 
retain biological/immunological activity include fragments comprising greater than about 100 
amino acids, or greater than about 200 amino acids, and fragments that encode specific protein 
domains. 
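
The notion of a "degenerate variant" defined above can be illustrated with a short Python sketch that translates two nucleotide fragments with the standard genetic code and checks that they differ in sequence yet encode the same polypeptide. The function names and the two 12-nucleotide example fragments are hypothetical and are not drawn from the sequence listing.

```python
# Illustrative sketch only: function names and example ORFs are assumptions.
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(orf: str) -> str:
    """Translate a DNA open reading frame, ignoring any trailing partial codon."""
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, usable, 3))


def is_degenerate_variant(fragment: str, candidate: str) -> bool:
    """True if the two fragments differ in nucleotide sequence but encode
    an identical polypeptide sequence."""
    return fragment.upper() != candidate.upper() and translate(fragment) == translate(candidate)


original = "ATGGCTAGCAAA"   # hypothetical fragment encoding M-A-S-K
variant = "ATGGCCTCTAAG"    # different codons, same amino acids
print(translate(original), is_degenerate_variant(original, variant))
```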

The purified polypeptides can be used in in vitro binding assays which are well known in 
the art to identify molecules which bind to the polypeptides. These molecules include, but are not 
limited to, for example, small molecules, molecules from combinatorial libraries, antibodies or other 
proteins. The molecules identified in the binding assay are then tested for antagonist or agonist 
activity in in vivo tissue culture or animal models that are well known in the art. In brief, the 
molecules are titrated into a plurality of cell cultures or animals and then tested for either 
cell/animal death or prolonged survival of the animal/cells. 

In addition, the peptides of the invention or molecules capable of binding to the peptides 
may be complexed with toxins, e.g., ricin or cholera, or with other compounds that are toxic to 
cells. The toxin-binding molecule complex is then targeted to a tumor or other cell by the 
specificity of the binding molecule for SEQ ID NO: 1351-2700. 

The protein of the invention may also be expressed as a product of transgenic animals, 
e.g., as a component of the milk of transgenic cows, goats, pigs, or sheep which are characterized 
by somatic or germ cells containing a nucleotide sequence encoding the protein. 

The proteins provided herein also include proteins characterized by amino acid sequences 
similar to those of purified proteins but into which modifications are naturally provided or 
deliberately engineered. For example, modifications, in the peptide or DNA sequence, can be 
made by those skilled in the art using known techniques. Modifications of interest in the protein 
sequences may include the alteration, substitution, replacement, insertion or deletion of a 
selected amino acid residue in the coding sequence. For example, one or more of the cysteine 
residues may be deleted or replaced with another amino acid to alter the conformation of the 
molecule. Techniques for such alteration, substitution, replacement, insertion or deletion are 
well known to those skilled in the art (see, e.g., U.S. Pat. No. 4,518,584). Preferably, such 
alteration, substitution, replacement, insertion or deletion retains the desired activity of the 
protein. Regions of the protein that are important for the protein function can be determined by 
various methods known in the art including the alanine-scanning method, which involves 
systematic substitution of single or strings of amino acids with alanine, followed by testing the 
resulting alanine-containing variant for biological activity. This type of analysis determines the 
importance of the substituted amino acid(s) in biological activity. Regions of the protein that are 
important for protein function may be determined by the eMATRIX program. 
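
As a minimal sketch of the sequence-generation step of the alanine-scanning method described above, the following Python function enumerates variants in which a single residue (or a string of residues, via an assumed `window` parameter) is replaced with alanine. The function name, the convention of skipping segments that are already all alanine, and the example peptide are hypothetical; the subsequent testing of each variant for biological activity is an experimental step not represented here.

```python
# Illustrative sketch only: names, the skip rule and the peptide are assumptions.
def alanine_scan(protein: str, window: int = 1) -> dict:
    """Map (first, last) 1-based residue positions to the variant sequence in
    which that stretch of `window` residues is replaced with alanine."""
    variants = {}
    for i in range(0, len(protein) - window + 1):
        segment = protein[i:i + window]
        if set(segment) == {"A"}:
            continue  # already alanine; substitution would change nothing
        variants[(i + 1, i + window)] = protein[:i] + "A" * window + protein[i + window:]
    return variants


# Hypothetical short peptide used purely as an example.
for positions, sequence in alanine_scan("MKACW").items():
    print(positions, sequence)
```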

Other fragments and derivatives of the sequences of proteins which would be expected to 
retain protein activity in whole or in part and are useful for screening or other immunological 
methodologies may also be easily made by those skilled in the art given the disclosures herein. 
Such modifications are encompassed by the present invention. 

The protein may also be produced by operably linking the isolated polynucleotide of the 
invention to suitable control sequences in one or more insect expression vectors, and employing 
an insect expression system. Materials and methods for baculovirus/insect cell expression 
systems are commercially available in kit form from, e.g., Invitrogen, San Diego, Calif., U.S.A. 
(the MaxBac™ kit), and such methods are well known in the art, as described in Summers and 
Smith, Texas Agricultural Experiment Station Bulletin No. 1555 (1987), incorporated herein by 
reference. As used herein, an insect cell capable of expressing a polynucleotide of the present 
invention is "transformed." 

The protein of the invention may be prepared by culturing transformed host cells under 
culture conditions suitable to express the recombinant protein. The resulting expressed protein 
may then be purified from such culture (i.e., from culture medium or cell extracts) using known 
purification processes, such as gel filtration and ion exchange chromatography. The purification 
of the protein may also include an affinity column containing agents which will bind to the 
protein; one or more column steps over such affinity resins as concanavalin A-agarose, 
heparin-toyopearl™ or Cibacron blue 3GA Sepharose™; one or more steps involving 
hydrophobic interaction chromatography using such resins as phenyl ether, butyl ether, or propyl 
ether; or immunoaffinity chromatography. 

Alternatively, the protein of the invention may also be expressed in a form which will 
facilitate purification. For example, it may be expressed as a fusion protein, such as those of 
maltose binding protein (MBP), glutathione-S-transferase (GST) or thioredoxin (TRX), or as a 
His tag. Kits for expression and purification of such fusion proteins are commercially available 
from New England BioLabs (Beverly, Mass.), Pharmacia (Piscataway, N.J.) and Invitrogen, 
respectively. The protein can also be tagged with an epitope and subsequently purified by using 
a specific antibody directed to such epitope. One such epitope ("FLAG®") is commercially 
available from Kodak (New Haven, Conn.). 

Finally, one or more reverse-phase high performance liquid chromatography (RP-HPLC) 
steps employing hydrophobic RP-HPLC media, e.g., silica gel having pendant methyl or other 
aliphatic groups, can be employed to further purify the protein. Some or all of the foregoing 
purification steps, in various combinations, can also be employed to provide a substantially 
homogeneous isolated recombinant protein. The protein thus purified is substantially free of 
other mammalian proteins and is defined in accordance with the present invention as an "isolated 
protein." 


The polypeptides of the invention include analogs (variants). This embraces fragments, 
as well as peptides in which one or more amino acids has been deleted, inserted, or substituted. 
Also, analogs of the polypeptides of the invention embrace fusions of the polypeptides or 
modifications of the polypeptides of the invention, wherein the polypeptide or analog is fused to 
another moiety or moieties, e.g., targeting moiety or another therapeutic agent. Such analogs 
may exhibit improved properties such as activity and/or stability. Examples of moieties which 
may be fused to the polypeptide or an analog include, for example, targeting moieties which 
provide for the delivery of polypeptide to pancreatic cells, e.g., antibodies to pancreatic cells, 
antibodies to immune cells such as T-cells, monocytes, dendritic cells, granulocytes, etc., as well 
as receptors and ligands expressed on pancreatic or immune cells. Other moieties which may be 
fused to the polypeptide include therapeutic agents which are used for treatment, for example, 
immunosuppressive drugs such as cyclosporin, FK506, azathioprine, CD3 antibodies and 
steroids. Also, polypeptides may be fused to immune modulators, and other cytokines such as 
alpha or beta interferon. 


4.6.1 DETERMINING POLYPEPTIDE AND POLYNUCLEOTIDE IDENTITY 
AND SIMILARITY 

Preferred methods to determine identity and/or similarity are designed to give the largest match 
between the sequences tested. Methods to determine identity and similarity are codified in computer 
programs including, but not limited to, the GCG program package, including GAP 
(Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984); Genetics Computer Group, 
University of Wisconsin, Madison, WI), BLASTP, BLASTN, BLASTX, FASTA (Altschul, S.F. 
et al., J. Molec. Biol. 215:403-410 (1990)), PSI-BLAST (Altschul S.F. et al., Nucleic Acids Res. 
vol. 25, pp. 3389-3402, herein incorporated by reference), eMatrix software (Wu et al., J. Comp. 
Biol., Vol. 6, pp. 219-235 (1999), herein incorporated by reference), eMotif software 
(Nevill-Manning et al., ISMB-97, Vol. 4, pp. 202-209, herein incorporated by reference), pFam software 
(Sonnhammer et al., Nucleic Acids Res., Vol. 26(1), pp. 320-322 (1998), herein incorporated by 
reference) and the Kyte-Doolittle hydrophobicity prediction algorithm (J. Mol Biol, 157, pp. 
105-31 (1982), incorporated herein by reference). The BLAST programs are publicly available 
from the National Center for Biotechnology Information (NCBI) and other sources (BLAST 
Manual, Altschul, S., et al. NCBI NLM NIH Bethesda, MD 20894; Altschul, S., et al., J. Mol. 
Biol. 215:403-410 (1990)). 
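
For illustration, once two sequences have been aligned by one of the programs listed above, percent identity can be computed from the alignment as in the following Python sketch. The function name, the gap character "-", and the choice of denominator (all alignment columns that are not gaps in both sequences) are assumptions made for this example; the cited programs may define the denominator differently, and the alignment step itself is not reproduced here.

```python
# Illustrative sketch only: the identity definition used here is an assumption.
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pairwise alignment.

    Both inputs must be the same length, with '-' marking gaps."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns in which both sequences carry a gap.
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b) if not (x == "-" and y == "-")]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


# Hypothetical aligned peptides: 5 identical residues over 6 columns.
identity = percent_identity("MKT-LV", "MKTALV")
print(f"{identity:.1f}% identity")
```

A value computed this way could then be compared with an identity threshold of the kind recited for "substantial equivalents" (e.g., at least about 90%).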

4.7 CHIMERIC AND FUSION PROTEINS 

The invention also provides chimeric or fusion proteins. As used herein, a "chimeric
protein" or "fusion protein" comprises a polypeptide of the invention operatively linked to
another polypeptide. Within a fusion protein the polypeptide according to the invention can
correspond to all or a portion of a protein according to the invention. In one embodiment, a
fusion protein comprises at least one biologically active portion of a protein according to the
invention. In another embodiment, a fusion protein comprises at least two biologically active
portions of a protein according to the invention. Within the fusion protein, the term "operatively
linked" is intended to indicate that the polypeptide according to the invention and the other
polypeptide are fused in-frame to each other. The polypeptide can be fused to the N-terminus or
C-terminus.
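
The requirement that two coding sequences be fused in-frame can be illustrated with a short sketch.
The following Python example is a simplified illustration under stated assumptions, not a cloning
protocol: the function names and the toy sequences are hypothetical, both inputs are assumed to be
complete coding sequences read from their first codon, and only the standard genetic code is
considered. The sketch removes a terminal stop codon from the upstream sequence, rejects inputs whose
length is not a multiple of three, and rejects an upstream sequence containing an internal stop codon.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def translate(cds):
    """Translate a DNA coding sequence codon by codon ('*' marks a stop codon)."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def fuse_in_frame(upstream_cds, downstream_cds):
    """Join two coding sequences so the downstream one stays in the upstream reading frame."""
    up = upstream_cds.upper()
    down = downstream_cds.upper()
    if len(up) % 3 != 0 or len(down) % 3 != 0:
        raise ValueError("each coding sequence must be a whole number of codons")
    # A stop codon at the end of the upstream partner would terminate the fusion early.
    if up[-3:] in STOP_CODONS:
        up = up[:-3]
    if "*" in translate(up):
        raise ValueError("upstream sequence contains an internal stop codon")
    return up + down


# Toy sequences (hypothetical): upstream encodes M-A-stop, downstream encodes K-G-stop.
fusion = fuse_in_frame("ATGGCCTAA", "AAAGGGTGA")
print(fusion)             # ATGGCCAAAGGGTGA
print(translate(fusion))  # MAKG*
```

In practice a linker sequence is often placed between the two partners; in this sketch a linker could
be handled by fusing three sequences in succession, provided the linker is also a whole number of
codons.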

For example, in one embodiment a fusion protein comprises a polypeptide according to
the invention operably linked to the extracellular domain of a second protein.

In another embodiment, the fusion protein is a GST-fusion protein in which the
polypeptide sequences of the invention are fused to the C-terminus of the GST (i.e., glutathione
S-transferase) sequences.

In another embodiment, the fusion protein is an immunoglobulin fusion protein in which
the polypeptide sequences according to the invention, comprising one or more domains, are fused
to sequences derived from a member of the immunoglobulin protein family. The
immunoglobulin fusion proteins of the invention can be incorporated into pharmaceutical
compositions and administered to a subject to inhibit an interaction between a ligand and a
protein of the invention on the surface of a cell, to thereby suppress signal transduction in vivo.
The immunoglobulin fusion proteins can be used to affect the bioavailability of a cognate ligand.
Inhibition of the ligand/protein interaction may be useful therapeutically both for the treatment of
proliferative and differentiative disorders, e.g., cancer, and for modulating (e.g., promoting or
inhibiting) cell survival. Moreover, the immunoglobulin fusion proteins of the invention can be
used as immunogens to produce antibodies in a subject, to purify ligands, and in screening assays
to identify molecules that inhibit the interaction of a polypeptide of the invention with a ligand.

A chimeric or fusion protein of the invention can be produced by standard recombinant
DNA techniques. For example, DNA fragments coding for the different polypeptide sequences
are ligated together in-frame in accordance with conventional techniques, e.g., by employing
blunt-ended or stagger-ended termini for ligation, restriction enzyme digestion to provide for
appropriate termini, filling-in of cohesive ends as appropriate, alkaline phosphatase treatment to
avoid undesirable joining, and enzymatic ligation. In another embodiment, the fusion gene can
be synthesized by conventional techniques including automated DNA synthesizers.
Alternatively, PCR amplification of gene fragments can be carried out using anchor primers that
give rise to complementary overhangs between two consecutive gene fragments that can
subsequently be annealed and reamplified to generate a chimeric gene sequence (see, for
example, Ausubel et al. (eds.) Current Protocols in Molecular Biology, John Wiley &
Sons, 1992). Moreover, many expression vectors are commercially available that already encode
a fusion moiety (e.g., a GST polypeptide). A nucleic acid encoding a polypeptide of the
invention can be cloned into such an expression vector such that the fusion moiety is linked
in-frame to the protein of the invention.

4.8 GENE THERAPY 

Mutations in the polynucleotides of the invention may result in loss of normal
function of the encoded protein. The invention thus provides gene therapy to restore normal
activity of the polypeptides of the invention, or to treat disease states involving polypeptides of
the invention. Delivery of a functional gene encoding polypeptides of the invention to
appropriate cells is effected ex vivo, in situ, or in vivo by use of vectors, and more particularly
viral vectors (e.g., adenovirus, adeno-associated virus, or a retrovirus), or ex vivo by use of
physical DNA transfer methods (e.g., liposomes or chemical treatments). See, for example,
Anderson, Nature, supplement to vol. 392, no. 6679, pp. 25-30 (1998). For additional reviews of
gene therapy technology see Friedmann, Science, 244: 1275-1281 (1989); Verma, Scientific
American: 68-84 (1990); and Miller, Nature, 357: 455-460 (1992). Introduction of any one of
the nucleotides of the present invention or a gene encoding the polypeptides of the present
invention can also be accomplished with extrachromosomal substrates (transient expression) or
artificial chromosomes (stable expression). Cells may also be cultured ex vivo in the presence of
proteins of the present invention in order to proliferate or to produce a desired effect on or
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes.
Alternatively, it is contemplated that in other human disease states, preventing the expression of
or inhibiting the activity of polypeptides of the invention will be useful in treating the disease
states. It is contemplated that antisense therapy or gene therapy could be applied to negatively
regulate the expression of polypeptides of the invention.

Other methods of inhibiting expression of a protein include the introduction of antisense
molecules to the nucleic acids of the present invention, their complements, or their translated RNA
sequences, by methods known in the art. Further, the polypeptides of the present invention can be
inhibited by using targeted deletion methods, or the insertion of a negative regulatory element such
as a silencer, which is tissue specific.

The present invention still further provides cells genetically engineered in vivo to express the
polynucleotides of the invention, wherein such polynucleotides are in operative association with a
regulatory sequence heterologous to the host cell which drives expression of the polynucleotides in
the cell. These methods can be used to increase or decrease the expression of the polynucleotides of
the present invention.

Knowledge of DNA sequences provided by the invention allows for modification of cells to
permit, increase, or decrease expression of endogenous polypeptide. Cells can be modified (e.g., by
homologous recombination) to provide increased polypeptide expression by replacing, in whole or
in part, the naturally occurring promoter with all or part of a heterologous promoter so that the cells
express the protein at higher levels. The heterologous promoter is inserted in such a manner that it is
operatively linked to the desired protein encoding sequences. See, for example, PCT International
Publication No. WO 94/12650, PCT International Publication No. WO 92/20808, and PCT
International Publication No. WO 91/09955. It is also contemplated that, in addition to heterologous
promoter DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the desired
protein coding sequence, amplification of the marker DNA by standard selection methods results in
co-amplification of the desired protein coding sequences in the cells.

In another embodiment of the present invention, cells and tissues may be engineered to 
express an endogenous gene comprising the polynucleotides of the invention under the control of 
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene may 
be replaced by homologous recombination. As described herein, gene targeting can be used to 
replace a gene's existing regulatory region with a regulatory sequence isolated from a different gene 
or a novel regulatory sequence synthesized by genetic engineering methods. Such regulatory 
sequences may be comprised of promoters, enhancers, scaffold-attachment regions, negative 
regulatory elements, transcriptional initiation sites, regulatory protein binding sites or combinations 
of said sequences. Alternatively, sequences which affect the structure or stability of the RNA or
protein produced may be replaced, removed, added, or otherwise modified by targeting. These
sequences include polyadenylation signals, mRNA stability elements, splice sites, leader sequences
for enhancing or modifying transport or secretion properties of the protein, or other sequences 
which alter or improve the function or stability of protein or RNA molecules. 

The targeting event may be a simple insertion of the regulatory sequence, placing the gene 
under the control of the new regulatory sequence, e.g., inserting a new promoter or enhancer or both 
upstream of a gene. Alternatively, the targeting event may be a simple deletion of a regulatory 
element, such as the deletion of a tissue-specific negative regulatory element. Alternatively, the 
targeting event may replace an existing element; for example, a tissue-specific enhancer can be 
replaced by an enhancer that has broader or different cell-type specificity than the naturally 
occurring elements. Here, the naturally occurring sequences are deleted and new sequences are
added. In all cases, the identification of the targeting event may be facilitated by the use of one or
more selectable marker genes that are contiguous with the targeting DNA, allowing for the selection
of cells in which the exogenous DNA has integrated into the cell genome. The identification of the
targeting event may also be facilitated by the use of one or more marker genes exhibiting the
property of negative selection, such that the negatively selectable marker is linked to the exogenous
DNA, but configured such that the negatively selectable marker flanks the targeting sequence, and
such that a correct homologous recombination event with sequences in the host cell genome does
not result in the stable integration of the negatively selectable marker. Markers useful for this
purpose include the Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial
xanthine-guanine phosphoribosyl-transferase (gpt) gene.
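
The selection logic described in the preceding paragraph can be summarized in a short sketch. The
following Python example is an idealized illustration only: the class and function names are
hypothetical, the positive marker is assumed to be a selectable marker carried within the targeting
DNA, and the negative marker is assumed to be one (such as the TK gene) that flanks the targeting
sequence. Real selection outcomes are not this clean; for example, a negative marker may be lost or
inactivated during random integration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clone:
    """Idealized description of which marker genes a cell clone has stably integrated."""
    has_positive_marker: bool  # selectable marker contiguous with the targeting DNA
    has_negative_marker: bool  # negatively selectable marker flanking the targeting sequence


def classify(clone):
    """Infer the most likely integration event from the markers present."""
    if not clone.has_positive_marker:
        return "no integration"
    if clone.has_negative_marker:
        # The flanking marker was retained, so the whole construct integrated.
        return "random integration"
    # The targeting DNA integrated but the flanking marker was excluded.
    return "homologous recombination"


def survives_double_selection(clone):
    """A clone survives only if it passes positive selection and escapes negative selection."""
    return classify(clone) == "homologous recombination"


for clone in (Clone(False, False), Clone(True, True), Clone(True, False)):
    print(classify(clone), survives_double_selection(clone))
```

Only the third clone in the loop, which carries the positive marker without the negative marker,
is reported as surviving both selections.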

The gene targeting or gene activation techniques which can be used in accordance with this
aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to Chappel;
U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No. PCT/US92/09627
(WO93/09222) by Selden et al.; and International Application No. PCT/US90/06436
(WO91/06667) by Skoultchi et al., each of which is incorporated by reference herein in its entirety.

4.9 TRANSGENIC ANIMALS 

In preferred methods to determine biological functions of the polypeptides of the
invention in vivo, one or more genes provided by the invention are either overexpressed or
inactivated in the germ line of animals using homologous recombination [Capecchi, Science
244: 1288-1292 (1989)]. Animals in which the gene is overexpressed, under the regulatory
control of exogenous or endogenous promoter elements, are known as transgenic animals.
Animals in which an endogenous gene has been inactivated by homologous recombination are
referred to as "knockout" animals. Knockout animals, preferably non-human mammals, can be
prepared as described in U.S. Patent No. 5,557,032, incorporated herein by reference. Transgenic
animals are useful to determine the roles polypeptides of the invention play in biological
processes, and preferably in disease states. Transgenic animals are useful as model systems to
identify compounds that modulate lipid metabolism. Transgenic animals, preferably non-human
mammals, are produced using methods as described in U.S. Patent No. 5,489,743 and PCT
Publication No. WO94/28122, incorporated herein by reference.

Transgenic animals can be prepared wherein all or part of a promoter of the
polynucleotides of the invention is either activated or inactivated to alter the level of expression
of the polypeptides of the invention. Inactivation can be carried out using homologous
recombination methods described above. Activation can be achieved by supplementing or even
replacing the homologous promoter to provide for increased protein expression. The homologous
promoter can be supplemented by insertion of one or more heterologous enhancer elements
known to confer promoter activation in a particular tissue.

The polynucleotides of the present invention also make possible the development,
through, e.g., homologous recombination or knockout strategies, of animals that fail to express
polypeptides of the invention or that express a variant polypeptide. Such animals are useful as
models for studying the in vivo activities of the polypeptides as well as for studying modulators of the
polypeptides of the invention.

4.10 USES AND BIOLOGICAL ACTIVITY 

The polynucleotides and proteins of the present invention are expected to exhibit one or
more of the uses or biological activities (including those associated with assays cited herein)
identified herein. Uses or activities described for proteins of the present invention may be
provided by administration or use of such proteins or of polynucleotides encoding such proteins
(such as, for example, in gene therapies or vectors suitable for introduction of DNA). The
mechanism underlying the particular condition or pathology will dictate whether the
polypeptides of the invention, the polynucleotides of the invention or modulators (activators or
inhibitors) thereof would be beneficial to the subject in need of treatment. Thus, "therapeutic
compositions of the invention" include compositions comprising isolated polynucleotides
(including recombinant DNA molecules, cloned genes and degenerate variants thereof) or
polypeptides of the invention (including full length protein, mature protein and truncations or
domains thereof), or compounds and other substances that modulate the overall activity of the
target gene products, either at the level of target gene/protein expression or target protein
activity. Such modulators include polypeptides, analogs (variants), including fragments and
fusion proteins, antibodies and other binding proteins; chemical compounds that directly or
indirectly activate or inhibit the polypeptides of the invention (identified, e.g., via drug screening
assays as described herein); antisense polynucleotides and polynucleotides suitable for triple
helix formation; and in particular antibodies or other binding partners that specifically recognize
one or more epitopes of the polypeptides of the invention.

The polypeptides of the present invention may likewise be involved in cellular activation
or in one of the other physiological pathways described herein.

4.10.1 RESEARCH USES AND UTILITIES 

The polynucleotides provided by the present invention can be used by the research
community for various purposes. The polynucleotides can be used to express recombinant
protein for analysis, characterization or therapeutic use; as markers for tissues in which the
corresponding protein is preferentially expressed (either constitutively or at a particular stage of
tissue differentiation or development or in disease states); as molecular weight markers on gels;
as chromosome markers or tags (when labeled) to identify chromosomes or to map related gene
positions; to compare with endogenous DNA sequences in patients to identify potential genetic
disorders; as probes to hybridize and thus discover novel, related DNA sequences; as a source of
information to derive PCR primers for genetic fingerprinting; as a probe to "subtract-out" known
sequences in the process of discovering other novel polynucleotides; for selecting and making
oligomers for attachment to a "gene chip" or other support, including for examination of
expression patterns; to raise anti-protein antibodies using DNA immunization techniques; and as
an antigen to raise anti-DNA antibodies or elicit another immune response. Where the
polynucleotide encodes a protein which binds or potentially binds to another protein (such as, for
example, in a receptor-ligand interaction), the polynucleotide can also be used in interaction trap
assays (such as, for example, that described in Gyuris et al., Cell 75:791-803 (1993)) to identify
polynucleotides encoding the other protein with which binding occurs or to identify inhibitors of
the binding interaction.

The polypeptides provided by the present invention can similarly be used in assays to
determine biological activity, including in a panel of multiple proteins for high-throughput
screening; to raise antibodies or to elicit another immune response; as a reagent (including the
labeled reagent) in assays designed to quantitatively determine levels of the protein (or its
receptor) in biological fluids; as markers for tissues in which the corresponding polypeptide is
preferentially expressed (either constitutively or at a particular stage of tissue differentiation or
development or in a disease state); and, of course, to isolate correlative receptors or ligands.
Proteins involved in these binding interactions can also be used to screen for peptide or small
molecule inhibitors or agonists of the binding interaction.

Any or all of these research utilities are capable of being developed into reagent grade or
kit format for commercialization as research products.

Methods for performing the uses listed above are well known to those skilled in the art.
References disclosing such methods include without limitation "Molecular Cloning: A
Laboratory Manual", 2d ed., Cold Spring Harbor Laboratory Press, Sambrook, J., E. F. Fritsch
and T. Maniatis eds., 1989, and "Methods in Enzymology: Guide to Molecular Cloning
Techniques", Academic Press, Berger, S. L. and A. R. Kimmel eds., 1987.

4.10.2 NUTRITIONAL USES 

Polynucleotides and polypeptides of the present invention can also be used as nutritional
sources or supplements. Such uses include without limitation use as a protein or amino acid
supplement, use as a carbon source, use as a nitrogen source and use as a source of carbohydrate. In
such cases the polypeptide or polynucleotide of the invention can be added to the feed of a
particular organism or can be administered as a separate solid or liquid preparation, such as in the
form of powder, pills, solutions, suspensions or capsules. In the case of microorganisms, the
polypeptide or polynucleotide of the invention can be added to the medium in or on which the
microorganism is cultured.

4.10.3 CYTOKINE AND CELL PROLIFERATION/DIFFERENTIATION 
ACTIVITY 

A polypeptide of the present invention may exhibit activity relating to cytokine, cell
proliferation (either inducing or inhibiting) or cell differentiation (either inducing or inhibiting)
activity or may induce production of other cytokines in certain cell populations. A
polynucleotide of the invention can encode a polypeptide exhibiting such attributes. Many
protein factors discovered to date, including all known cytokines, have exhibited activity in one
or more factor-dependent cell proliferation assays, and hence the assays serve as a convenient
confirmation of cytokine activity. The activity of therapeutic compositions of the present
invention is evidenced by any one of a number of routine factor-dependent cell proliferation
assays for cell lines including, without limitation, 32D, DA2, DA1G, T10, B9, B9/11, BaF3,
MC9/G, M+ (preB M+), 2E8, RB5, DA1, 123, T1165, HT2, CTLL2, TF-1, Mo7e, CMK,
HUVEC, and Caco. Therapeutic compositions of the invention can be used in the following:

Assays for T-cell or thymocyte proliferation include without limitation those described
in: Current Protocols in Immunology, Ed. by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E.
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3,
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Bertagnolli et al., J. Immunol.
145:1706-1712, 1990; Bertagnolli et al., Cellular Immunology 133:327-341, 1991; Bertagnolli
et al., J. Immunol. 149:3778-3783, 1992; Bowman et al., J. Immunol. 152:1756-1761, 1994.

Assays for cytokine production and/or proliferation of spleen cells, lymph node cells or
thymocytes include, without limitation, those described in: Polyclonal T cell stimulation,
Kruisbeek, A. M. and Shevach, E. M. In Current Protocols in Immunology. J. E. e.a. Coligan
eds. Vol 1 pp. 3.12.1-3.12.14, John Wiley and Sons, Toronto. 1994; and Measurement of mouse
and human interleukin-γ, Schreiber, R. D. In Current Protocols in Immunology. J. E. e.a. Coligan
eds. Vol 1 pp. 6.8.1-6.8.8, John Wiley and Sons, Toronto. 1994.

Assays for proliferation and differentiation of hematopoietic and lymphopoietic cells
include, without limitation, those described in: Measurement of Human and Murine Interleukin 2
and Interleukin 4, Bottomly, K., Davis, L. S. and Lipsky, P. E. In Current Protocols in
Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.3.1-6.3.12, John Wiley and Sons, Toronto. 1991;
deVries et al., J. Exp. Med. 173:1205-1211, 1991; Moreau et al., Nature 336:690-692, 1988;
Greenberger et al., Proc. Natl. Acad. Sci. U.S.A. 80:2931-2938, 1983; Measurement of mouse
and human interleukin 6 - Nordan, R. In Current Protocols in Immunology. J. E. Coligan eds. Vol
1 pp. 6.6.1-6.6.5, John Wiley and Sons, Toronto. 1991; Smith et al., Proc. Natl. Acad. Sci.
U.S.A. 83:1857-1861, 1986; Measurement of human Interleukin 11 - Bennett, F., Giannotti, J.,
Clark, S. C. and Turner, K. J. In Current Protocols in Immunology. J. E. Coligan eds. Vol 1 pp.
6.15.1 John Wiley and Sons, Toronto. 1991; Measurement of mouse and human Interleukin
9 - Ciarletta, A., Giannotti, J., Clark, S. C. and Turner, K. J. In Current Protocols in Immunology.
J. E. Coligan eds. Vol 1 pp. 6.13.1, John Wiley and Sons, Toronto. 1991.

Assays for T-cell clone responses to antigens (which will identify, among others, proteins
that affect APC-T cell interactions as well as direct T-cell effects by measuring proliferation and
cytokine production) include, without limitation, those described in: Current Protocols in
Immunology, Ed. by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober,
Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, In Vitro assays for Mouse
Lymphocyte Function; Chapter 6, Cytokines and their cellular receptors; Chapter 7,
Immunologic studies in Humans); Weinberger et al., Proc. Natl. Acad. Sci. USA 77:6091-6095,
1980; Weinberger et al., Eur. J. Immun. 11:405-411, 1981; Takai et al., J. Immunol.
137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 1988.

4.10.4 STEM CELL GROWTH FACTOR ACTIVITY 

A polypeptide of the present invention may exhibit stem cell growth factor activity and
be involved in the proliferation, differentiation and survival of pluripotent and totipotent stem
cells including primordial germ cells, embryonic stem cells, hematopoietic stem cells and/or
germ line stem cells. Administration of the polypeptide of the invention to stem cells in vivo or
ex vivo is expected to maintain and expand cell populations in a totipotential or pluripotential
state which would be useful for re-engineering damaged or diseased tissues, transplantation,
manufacture of bio-pharmaceuticals and the development of bio-sensors. The ability to produce
large quantities of human cells has important working applications for the production of human
proteins which currently must be obtained from non-human sources or donors; implantation of
cells to treat diseases such as Parkinson's, Alzheimer's and other neurodegenerative diseases;
tissues for grafting such as bone marrow, skin, cartilage, tendons, bone, muscle (including
cardiac muscle), blood vessels, cornea, neural cells, gastrointestinal cells and others; and organs
for transplantation such as kidney, liver, pancreas (including islet cells), heart and lung.

It is contemplated that multiple different exogenous growth factors and/or cytokines may
be administered in combination with the polypeptide of the invention to achieve the desired
effect, including any of the growth factors listed herein, other stem cell maintenance factors, and
specifically including stem cell factor (SCF), leukemia inhibitory factor (LIF), Flt-3 ligand
(Flt-3L), any of the interleukins, recombinant soluble IL-6 receptor fused to IL-6, macrophage
inflammatory protein 1-alpha (MIP-1-alpha), G-CSF, GM-CSF, thrombopoietin (TPO), platelet
factor 4 (PF-4), platelet-derived growth factor (PDGF), neural growth factors and basic
fibroblast growth factor (bFGF).

Since totipotent stem cells can give rise to virtually any mature cell type, expansion of
these cells in culture will facilitate the production of large quantities of mature cells. Techniques
for culturing stem cells are known in the art and administration of polypeptides of the invention,
optionally with other growth factors and/or cytokines, is expected to enhance the survival and
proliferation of the stem cell populations. This can be accomplished by direct administration of
the polypeptide of the invention to the culture medium. Alternatively, stromal cells transfected
with a polynucleotide that encodes the polypeptide of the invention can be used as a feeder
layer for the stem cell populations in culture or in vivo. Stromal support cells for feeder layers
may include embryonic bone marrow fibroblasts, bone marrow stromal cells, fetal liver cells, or
cultured embryonic fibroblasts (see U.S. Patent No. 5,690,926).

Stem cells themselves can be transfected with a polynucleotide of the invention to induce
autocrine expression of the polypeptide of the invention. This will allow for generation of
undifferentiated totipotential/pluripotential stem cell lines that are useful as is or that can then be
differentiated into the desired mature cell types. These stable cell lines can also serve as a source
of undifferentiated totipotential/pluripotential mRNA to create cDNA libraries and templates for
polymerase chain reaction experiments. These studies would allow for the isolation and
identification of differentially expressed genes in stem cell populations that regulate stem cell
proliferation and/or maintenance.

Expansion and maintenance of totipotent stem cell populations will be useful in the
treatment of many pathological conditions. For example, polypeptides of the present invention
may be used to manipulate stem cells in culture to give rise to neuroepithelial cells that can be
used to augment or replace cells damaged by illness, autoimmune disease, accidental damage or
genetic disorders. The polypeptide of the invention may be useful for inducing the proliferation
of neural cells and for the regeneration of nerve and brain tissue, i.e., for the treatment of central
and peripheral nervous system diseases and neuropathies, as well as mechanical and traumatic
disorders which involve degeneration, death or trauma to neural cells or nerve tissue. In addition,
the expanded stem cell populations can also be genetically altered for gene therapy purposes and
to decrease host rejection of replacement tissues after grafting or implantation.

Expression of the polypeptide of the invention and its effect on stem cells can also be
manipulated to achieve controlled differentiation of the stem cells into more differentiated cell
types. A broadly applicable method of obtaining pure populations of a specific differentiated
cell type from undifferentiated stem cell populations involves the use of a cell-type specific
promoter driving a selectable marker. The selectable marker allows only cells of the desired type
to survive. For example, stem cells can be induced to differentiate into cardiomyocytes (Wobus
et al., Differentiation, 48: 173-182, (1991); Klug et al., J. Clin. Invest., 98(1): 216-224, (1998))
or skeletal muscle cells (Browder, L. W. In: Principles of Tissue Engineering eds. Lanza et al.,
Academic Press (1997)). Alternatively, directed differentiation of stem cells can be
accomplished by culturing the stem cells in the presence of a differentiation factor such as
retinoic acid and an antagonist of the polypeptide of the invention which would inhibit the
effects of endogenous stem cell factor activity and allow differentiation to proceed.

In vitro cultures of stem cells can be used to determine if the polypeptide of the invention
exhibits stem cell growth factor activity. Stem cells are isolated from any one of various cell
sources (including hematopoietic stem cells and embryonic stem cells) and cultured on a feeder
layer, as described by Thomson et al., Proc. Natl. Acad. Sci. U.S.A., 92: 7844-7848 (1995), in
the presence of the polypeptide of the invention alone or in combination with other growth
factors or cytokines. The ability of the polypeptide of the invention to induce stem cell
proliferation is determined by colony formation on semi-solid support, e.g., as described by
Bernstein et al., Blood, 77: 2316-2321 (1991).

4.10.5 HEMATOPOIESIS REGULATING ACTIVITY 

A polypeptide of the present invention may be involved in regulation of hematopoiesis
and, consequently, in the treatment of myeloid or lymphoid cell disorders. Even marginal
biological activity in support of colony forming cells or of factor-dependent cell lines indicates
involvement in regulating hematopoiesis, e.g., in supporting the growth and proliferation of
erythroid progenitor cells alone or in combination with other cytokines, thereby indicating
utility, for example, in treating various anemias or for use in conjunction with
irradiation/chemotherapy to stimulate the production of erythroid precursors and/or erythroid
cells; in supporting the growth and proliferation of myeloid cells such as granulocytes and
monocytes/macrophages (i.e., traditional CSF activity) useful, for example, in conjunction with
chemotherapy to prevent or treat consequent myelo-suppression; in supporting the growth and
proliferation of megakaryocytes and consequently of platelets thereby allowing prevention or
treatment of various platelet disorders such as thrombocytopenia, and generally for use in place
of or complementary to platelet transfusions; and/or in supporting the growth and proliferation of
hematopoietic stem cells which are capable of maturing to any and all of the above-mentioned
hematopoietic cells and therefore find therapeutic utility in various stem cell disorders (such as
those usually treated with transplantation, including, without limitation, aplastic anemia and
paroxysmal nocturnal hemoglobinuria), as well as in repopulating the stem cell compartment
post irradiation/chemotherapy, either in-vivo or ex-vivo (i.e., in conjunction with bone marrow
transplantation or with peripheral progenitor cell transplantation (homologous or heterologous))
as normal cells or genetically manipulated for gene therapy.

Therapeutic compositions of the invention can be used in the following: 

Suitable assays for proliferation and differentiation of various hematopoietic lines are
cited above.

Assays for embryonic stem cell differentiation (which will identify, among others, 
proteins that influence embryonic differentiation hematopoiesis) include, without limitation, 
those described in: Johansson et al. Cellular Biology 15:141-151, 1995; Keller et al., Molecular
and Cellular Biology 13:473-486, 1993; McClanahan et al., Blood 81:2903-2915, 1993.

Assays for stem cell survival and differentiation (which will identify, among others,
proteins that regulate lympho-hematopoiesis) include, without limitation, those described in:
Methylcellulose colony forming assays, Freshney, M. G. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 265-268, Wiley-Liss, Inc., New York, N.Y. 1994; Hirayama et al.,
Proc. Natl. Acad. Sci. USA 89:5907-5911, 1992; Primitive hematopoietic colony forming cells
with high proliferative potential, McNiece, I. K. and Briddell, R. A. In Culture of Hematopoietic
Cells. R. I. Freshney, et al. eds. Vol pp. 23-39, Wiley-Liss, Inc., New York, N.Y. 1994; Neben et
al., Experimental Hematology 22:353-359, 1994; Cobblestone area forming cell assay,
Ploemacher, R. E. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 1-21,
Wiley-Liss, Inc., New York, N.Y. 1994; Long term bone marrow cultures in the presence of
stromal cells, Spooncer, E., Dexter, M. and Allen, T. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 163-179, Wiley-Liss, Inc., New York, N.Y. 1994; Long term culture
initiating cell assay, Sutherland, H. J. In Culture of Hematopoietic Cells. R. I. Freshney, et al. 
eds. Vol pp. 139-162, Wiley-Liss, Inc., New York, N.Y. 1994. 



4.10.6 TISSUE GROWTH ACTIVITY

A polypeptide of the present invention also may be involved in bone, cartilage, tendon, 
ligament and/or nerve tissue growth or regeneration, as well as in wound healing and tissue 
repair and replacement, and in healing of burns, incisions and ulcers. 

A polypeptide of the present invention which induces cartilage and/or bone growth in
circumstances where bone is not normally formed has application in the healing of bone
fractures and cartilage damage or defects in humans and other animals. Compositions of a
polypeptide, antibody, binding partner, or other modulator of the invention may have
prophylactic use in closed as well as open fracture reduction and also in the improved fixation of
artificial joints. De novo bone formation induced by an osteogenic agent contributes to the
repair of congenital, trauma-induced, or oncologic resection-induced craniofacial defects, and
also is useful in cosmetic plastic surgery.

A polypeptide of this invention may also be involved in attracting bone-forming cells, 
stimulating growth of bone-forming cells, or inducing differentiation of progenitors of
bone-forming cells. Treatment of osteoporosis, osteoarthritis, bone degenerative disorders, or 
periodontal disease, such as through stimulation of bone and/or cartilage repair or by blocking 
inflammation or processes of tissue destruction (collagenase activity, osteoclast activity, etc.) 
mediated by inflammatory processes may also be possible using the composition of the 
invention.

Another category of tissue regeneration activity that may involve the polypeptide of the
present invention is tendon/ligament formation. Induction of tendon/ligament-like tissue or
other tissue formation in circumstances where such tissue is not normally formed has
application in the healing of tendon or ligament tears, deformities and other tendon or ligament
defects in humans and other animals. Such a preparation employing a tendon/ligament-like
tissue inducing protein may have prophylactic use in preventing damage to tendon or ligament
tissue, as well as use in the improved fixation of tendon or ligament to bone or other tissues, and
in repairing defects to tendon or ligament tissue. De novo tendon/ligament-like tissue formation
induced by a composition of the present invention contributes to the repair of congenital, trauma
induced, or other tendon or ligament defects of other origin, and is also useful in cosmetic plastic
surgery for attachment or repair of tendons or ligaments. The compositions of the present
invention may provide an environment to attract tendon- or ligament-forming cells, stimulate
growth of tendon- or ligament-forming cells, induce differentiation of progenitors of tendon- or
ligament-forming cells, or induce growth of tendon/ligament cells or progenitors ex vivo for
return in vivo to effect tissue repair. The compositions of the invention may also be useful in the
treatment of tendinitis, carpal tunnel syndrome and other tendon or ligament defects. The
compositions may also include an appropriate matrix and/or sequestering agent as a carrier as is
well known in the art.

The compositions of the present invention may also be useful for proliferation of neural
cells and for regeneration of nerve and brain tissue, i.e., for the treatment of central and peripheral
nervous system diseases and neuropathies, as well as mechanical and traumatic disorders, which
involve degeneration, death or trauma to neural cells or nerve tissue. More specifically, a
composition may be used in the treatment of diseases of the peripheral nervous system, such as
peripheral nerve injuries, peripheral neuropathy and localized neuropathies, and central nervous
system diseases, such as Alzheimer's, Parkinson's disease, Huntington's disease, amyotrophic
lateral sclerosis, and Shy-Drager syndrome. Further conditions which may be treated in
accordance with the present invention include mechanical and traumatic disorders, such as spinal
cord disorders, head trauma and cerebrovascular diseases such as stroke. Peripheral neuropathies
resulting from chemotherapy or other medical therapies may also be treatable using a
composition of the invention.

Compositions of the invention may also be useful to promote better or faster closure of 
non-healing wounds, including without limitation pressure ulcers, ulcers associated with vascular 
insufficiency, surgical and traumatic wounds, and the like. 

Compositions of the present invention may also be involved in the generation or
regeneration of other tissues, such as organs (including, for example, pancreas, liver, intestine,
kidney, skin, endothelium), muscle (smooth, skeletal or cardiac) and vascular (including vascular
endothelium) tissue, or for promoting the growth of cells comprising such tissues. Part of the
desired effects may be by inhibition or modulation of fibrotic scarring, which may allow normal
tissue to regenerate. A polypeptide of the present invention may also exhibit angiogenic activity.

A composition of the present invention may also be useful for gut protection or
regeneration and treatment of lung or liver fibrosis, reperfusion injury in various tissues, and
conditions resulting from systemic cytokine damage.

A composition of the present invention may also be useful for promoting or inhibiting
differentiation of tissues described above from precursor tissues or cells; or for inhibiting the
growth of tissues described above.

Therapeutic compositions of the invention can be used in the following:

Assays for tissue generation activity include, without limitation, those described in:
International Patent Publication No. WO95/16035 (bone, cartilage, tendon); International Patent
Publication No. WO95/05846 (nerve, neuronal); International Patent Publication No.
WO91/07491 (skin, endothelium).

Assays for wound healing activity include, without limitation, those described in: Winter,
Epidermal Wound Healing, pps. 71-112 (Maibach, H. I. and Rovee, D. T., eds.), Year Book
Medical Publishers, Inc., Chicago, as modified by Eaglstein and Mertz, J. Invest. Dermatol.
71:382-84 (1978).

4.10.7 IMMUNE STIMULATING OR SUPPRESSING ACTIVITY 

A polypeptide of the present invention may also exhibit immune stimulating or immune
suppressing activity, including without limitation the activities for which assays are described
herein. A polynucleotide of the invention can encode a polypeptide exhibiting such activities. A
protein may be useful in the treatment of various immune deficiencies and disorders (including
severe combined immunodeficiency (SCID)), e.g., in regulating (up or down) growth and
proliferation of T and/or B lymphocytes, as well as affecting the cytolytic activity of NK cells
and other cell populations. These immune deficiencies may be genetic or be caused by viral (e.g.,
HIV) as well as bacterial or fungal infections, or may result from autoimmune disorders. More
specifically, infectious diseases caused by viral, bacterial, fungal or other infection may be
treatable using a protein of the present invention, including infections by HIV, hepatitis viruses,
herpes viruses, mycobacteria, Leishmania spp., malaria spp. and various fungal infections such
as candidiasis. Of course, in this regard, proteins of the present invention may also be useful
where a boost to the immune system generally may be desirable, i.e., in the treatment of cancer.

Autoimmune disorders which may be treated using a protein of the present invention
include, for example, connective tissue disease, multiple sclerosis, systemic lupus erythematosus,
rheumatoid arthritis, autoimmune pulmonary inflammation, Guillain-Barre syndrome,
autoimmune thyroiditis, insulin dependent diabetes mellitus, myasthenia gravis, graft-versus-host
disease and autoimmune inflammatory eye disease. Such a protein (or antagonists thereof,
including antibodies) of the present invention may also be useful in the treatment of allergic
reactions and conditions (e.g., anaphylaxis, serum sickness, drug reactions, food allergies, insect
venom allergies, mastocytosis, allergic rhinitis, hypersensitivity pneumonitis, urticaria,
angioedema, eczema, atopic dermatitis, allergic contact dermatitis, erythema multiforme,
Stevens-Johnson syndrome, allergic conjunctivitis, atopic keratoconjunctivitis, vernal
keratoconjunctivitis, giant papillary conjunctivitis and contact allergies), such as asthma
(particularly allergic asthma) or other respiratory problems. Other conditions, in which immune
suppression is desired (including, for example, organ transplantation), may also be treatable
using a protein (or antagonists thereof) of the present invention. The therapeutic effects of the
polypeptides or antagonists thereof on allergic reactions can be evaluated by in vivo animal
models such as the cumulative contact enhancement test (Lastbom et al., Toxicology 125: 59-66,
1998), skin prick test (Hoffmann et al., Allergy 54: 446-54, 1999), guinea pig skin sensitization
test (Vohr et al., Arch. Toxicol. 73: 501-9), and murine local lymph node assay (Kimber et al.,
J. Toxicol. Environ. Health 53: 563-79).

Using the proteins of the invention it may also be possible to modulate immune
responses in a number of ways. Down regulation may be in the form of inhibiting or blocking an
immune response already in progress or may involve preventing the induction of an immune
response. The functions of activated T cells may be inhibited by suppressing T cell responses or
by inducing specific tolerance in T cells, or both. Immunosuppression of T cell responses is
generally an active, non-antigen-specific, process which requires continuous exposure of the T
cells to the suppressive agent. Tolerance, which involves inducing non-responsiveness or anergy
in T cells, is distinguishable from immunosuppression in that it is generally antigen-specific and
persists after exposure to the tolerizing agent has ceased. Operationally, tolerance can be
demonstrated by the lack of a T cell response upon reexposure to specific antigen in the absence
of the tolerizing agent.

Down regulating or preventing one or more antigen functions (including without
limitation B lymphocyte antigen functions (such as, for example, B7)), e.g., preventing high
level lymphokine synthesis by activated T cells, will be useful in situations of tissue, skin and
organ transplantation and in graft-versus-host disease (GVHD). For example, blockage of T cell
function should result in reduced tissue destruction in tissue transplantation. Typically, in tissue
transplants, rejection of the transplant is initiated through its recognition as foreign by T cells,
followed by an immune reaction that destroys the transplant. The administration of a therapeutic
composition of the invention may prevent cytokine synthesis by immune cells, such as T cells,
and thus acts as an immunosuppressant. Moreover, a lack of costimulation may also be sufficient
to anergize the T cells, thereby inducing tolerance in a subject. Induction of long-term tolerance
by B lymphocyte antigen-blocking reagents may avoid the necessity of repeated administration
of these blocking reagents. To achieve sufficient immunosuppression or tolerance in a subject, it
may also be necessary to block the function of a combination of B lymphocyte antigens.

The efficacy of particular therapeutic compositions in preventing organ transplant
rejection or GVHD can be assessed using animal models that are predictive of efficacy in
humans. Examples of appropriate systems which can be used include allogeneic cardiac grafts in
rats and xenogeneic pancreatic islet cell grafts in mice, both of which have been used to examine
the immunosuppressive effects of CTLA4Ig fusion proteins in vivo as described in Lenschow et
al., Science 257:789-792 (1992) and Turka et al., Proc. Natl. Acad. Sci USA, 89:11102-11105
(1992). In addition, murine models of GVHD (see Paul ed., Fundamental Immunology, Raven
Press, New York, 1989, pp. 846-847) can be used to determine the effect of therapeutic
compositions of the invention on the development of that disease.

Blocking antigen function may also be therapeutically useful for treating autoimmune
diseases. Many autoimmune disorders are the result of inappropriate activation of T cells that are
reactive against self tissue and which promote the production of cytokines and autoantibodies
involved in the pathology of the diseases. Preventing the activation of autoreactive T cells may
reduce or eliminate disease symptoms. Administration of reagents which block stimulation of T
cells can be used to inhibit T cell activation and prevent production of autoantibodies or T
cell-derived cytokines which may be involved in the disease process. Additionally, blocking
reagents may induce antigen-specific tolerance of autoreactive T cells which could lead to
long-term relief from the disease. The efficacy of blocking reagents in preventing or alleviating
autoimmune disorders can be determined using a number of well-characterized animal models of
human autoimmune diseases. Examples include murine experimental autoimmune encephalitis,
systemic lupus erythematosus in MRL/lpr/lpr mice or NZB hybrid mice, murine autoimmune
collagen arthritis, diabetes mellitus in NOD mice and BB rats, and murine experimental
myasthenia gravis (see Paul ed., Fundamental Immunology, Raven Press, New York, 1989, pp.
840-856).

Upregulation of an antigen function (e.g., a B lymphocyte antigen function), as a means
of upregulating immune responses, may also be useful in therapy. Upregulation of immune
responses may be in the form of enhancing an existing immune response or eliciting an initial
immune response. For example, enhancing an immune response may be useful in cases of viral
infection, including systemic viral diseases such as influenza, the common cold, and
encephalitis.

Alternatively, anti-viral immune responses may be enhanced in an infected patient by
removing T cells from the patient, costimulating the T cells in vitro with viral antigen-pulsed
APCs either expressing a peptide of the present invention or together with a stimulatory form of
a soluble peptide of the present invention and reintroducing the in vitro activated T cells into the
patient. Another method of enhancing anti-viral immune responses would be to isolate infected
cells from a patient, transfect them with a nucleic acid encoding a protein of the present
invention as described herein such that the cells express all or a portion of the protein on their
surface, and reintroduce the transfected cells into the patient. The infected cells would now be
capable of delivering a costimulatory signal to, and thereby activate, T cells in vivo.

A polypeptide of the present invention may provide the necessary stimulation signal to T
cells to induce a T cell mediated immune response against the transfected tumor cells. In
addition, tumor cells which lack MHC class I or MHC class II molecules, or which fail to
reexpress sufficient amounts of MHC class I or MHC class II molecules, can be transfected with
nucleic acid encoding all or a portion (e.g., a cytoplasmic-domain truncated portion) of an
MHC class I alpha chain protein and β2 microglobulin protein or an MHC class II alpha chain
protein and an MHC class II beta chain protein to thereby express MHC class I or MHC class II
proteins on the cell surface. Expression of the appropriate class I or class II MHC in conjunction
with a peptide having the activity of a B lymphocyte antigen (e.g., B7-1, B7-2, B7-3) induces a T
cell mediated immune response against the transfected tumor cell. Optionally, a gene encoding
an antisense construct which blocks expression of an MHC class II associated protein, such as
the invariant chain, can also be cotransfected with a DNA encoding a peptide having the activity
of a B lymphocyte antigen to promote presentation of tumor associated antigens and induce
tumor specific immunity. Thus, the induction of a T cell mediated immune response in a human
subject may be sufficient to overcome tumor-specific tolerance in the subject.

The activity of a protein of the invention may, among other means, be measured by the 
following methods: 

Suitable assays for thymocyte or splenocyte cytotoxicity include, without limitation,
those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D.
H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19;
Chapter 7, Immunologic studies in Humans); Herrmann et al., Proc. Natl. Acad. Sci. USA
78:2488-2492, 1981; Herrmann et al., J. Immunol. 128:1968-1974, 1982; Handa et al., J.
Immunol. 135:1564-1572, 1985; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J.
Immunol. 140:508-512, 1988; Bowman et al., J. Virology 61:1992-1998; Bertagnolli et al.,
Cellular Immunology 133:327-341, 1991; Brown et al., J. Immunol. 153:3079-3092, 1994.

Assays for T-cell-dependent immunoglobulin responses and isotype switching (which
will identify, among others, proteins that modulate T-cell dependent antibody responses and that
affect Th1/Th2 profiles) include, without limitation, those described in: Maliszewski, J.
Immunol. 144:3028-3033, 1990; and Assays for B cell function: In vitro antibody production,
Mond, J. J. and Brunswick, M. In Current Protocols in Immunology. J. E. e.a. Coligan eds. Vol 1
pp. 3.8.1-3.8.16, John Wiley and Sons, Toronto. 1994.

Mixed lymphocyte reaction (MLR) assays (which will identify, among others, proteins
that generate predominantly Th1 and CTL responses) include, without limitation, those described
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E.
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3,
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512,
1988; Bertagnolli et al., J. Immunol. 149:3778-3783, 1992.

Dendritic cell-dependent assays (which will identify, among others, proteins expressed
by dendritic cells that activate naive T-cells) include, without limitation, those described in:
Guery et al., J. Immunol. 134:536-544, 1995; Inaba et al., Journal of Experimental Medicine
173:549-559, 1991; Macatonia et al., Journal of Immunology 154:5071-5079, 1995; Porgador et
al., Journal of Experimental Medicine 182:255-260, 1995; Nair et al., Journal of Virology
67:4062-4069, 1993; Huang et al., Science 264:961-965, 1994; Macatonia et al., Journal of
Experimental Medicine 169:1255-1264, 1989; Bhardwaj et al., Journal of Clinical Investigation
94:797-807, 1994; and Inaba et al., Journal of Experimental Medicine 172:631-640, 1990.

Assays for lymphocyte survival/apoptosis (which will identify, among others, proteins
that prevent apoptosis after superantigen induction and proteins that regulate lymphocyte
homeostasis) include, without limitation, those described in: Darzynkiewicz et al., Cytometry
13:795-808, 1992; Gorczyca et al., Leukemia 7:659-670, 1993; Gorczyca et al., Cancer Research
53:1945-1951, 1993; Itoh et al., Cell 66:233-243, 1991; Zacharchuk, Journal of Immunology
145:4037-4045, 1990; Zamai et al., Cytometry 14:891-897, 1993; Gorczyca et al., International
Journal of Oncology 1:639-648, 1992.

Assays for proteins that influence early steps of T-cell commitment and development
include, without limitation, those described in: Antica et al., Blood 84:111-117, 1994; Fine et al.,
Cellular Immunology 155:111-122, 1994; Galy et al., Blood 85:2770-2778, 1995; Toki et al.,
Proc. Natl. Acad. Sci. USA 88:7548-7551, 1991.

4.10.8 ACTIVIN/INHIBIN ACTIVITY

A polypeptide of the present invention may also exhibit activin- or inhibin-related
activities. A polynucleotide of the invention may encode a polypeptide exhibiting such
characteristics. Inhibins are characterized by their ability to inhibit the release of follicle
stimulating hormone (FSH), while activins are characterized by their ability to stimulate the
release of follicle stimulating hormone (FSH). Thus, a polypeptide of the present invention,
alone or in heterodimers with a member of the inhibin family, may be useful as a contraceptive
based on the ability of inhibins to decrease fertility in female mammals and decrease
spermatogenesis in male mammals. Administration of sufficient amounts of other inhibins can
induce infertility in these mammals. Alternatively, the polypeptide of the invention, as a
homodimer or as a heterodimer with other protein subunits of the inhibin group, may be useful as
a fertility inducing therapeutic, based upon the ability of activin molecules to stimulate FSH
release from cells of the anterior pituitary. See, for example, U.S. Pat. No. 4,798,885. A
polypeptide of the invention may also be useful for advancement of the onset of fertility in
sexually immature mammals, so as to increase the lifetime reproductive performance of domestic
animals such as, but not limited to, cows, sheep and pigs.

The activity of a polypeptide of the invention may, among other means, be measured by 
the following methods. 

Assays for activin/inhibin activity include, without limitation, those described in: Vale et
al., Endocrinology 91:562-572, 1972; Ling et al., Nature 321:779-782, 1986; Vale et al., Nature
321:776-779, 1986; Mason et al., Nature 318:659-663, 1985; Forage et al., Proc. Natl. Acad. Sci. 
USA 83:3091-3095, 1986. 

4.10.9 CHEMOTACTIC/CHEMOKINETIC ACTIVITY

A polypeptide of the present invention may be involved in chemotactic or chemokinetic
activity for mammalian cells, including, for example, monocytes, fibroblasts, neutrophils,
T-cells, mast cells, eosinophils, epithelial and/or endothelial cells. A polynucleotide of the
invention can encode a polypeptide exhibiting such attributes. Chemotactic and chemokinetic
receptor activation can be used to mobilize or attract a desired cell population to a desired site of
action. Chemotactic or chemokinetic compositions (e.g. proteins, antibodies, binding partners, or
modulators of the invention) provide particular advantages in treatment of wounds and other
trauma to tissues, as well as in treatment of localized infections. For example, attraction of
lymphocytes, monocytes or neutrophils to tumors or sites of infection may result in improved
immune responses against the tumor or infecting agent.

A protein or peptide has chemotactic activity for a particular cell population if it can
stimulate, directly or indirectly, the directed orientation or movement of such cell population.
Preferably, the protein or peptide has the ability to directly stimulate directed movement of cells.
Whether a particular protein has chemotactic activity for a population of cells can be readily
determined by employing such protein or peptide in any known assay for cell chemotaxis.

Therapeutic compositions of the invention can be used in the following:

Assays for chemotactic activity (which will identify proteins that induce or prevent
chemotaxis) consist of assays that measure the ability of a protein to induce the migration of
cells across a membrane as well as the ability of a protein to induce the adhesion of one cell
population to another cell population. Suitable assays for movement and adhesion include,
without limitation, those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A.
M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates
and Wiley-Interscience (Chapter 6.12, Measurement of alpha and beta Chemokines
6.12.1-6.12.28); Taub et al. J. Clin. Invest. 95:1370-1376, 1995; Lind et al. APMIS 103:140-146,
1995; Muller et al. Eur. J. Immunol. 25:1744-1748; Gruber et al. J. of Immunol. 152:5860-5867,
1994; Johnston et al. J. of Immunol. 153:1762-1768, 1994.

4.10.10 HEMOSTATIC AND THROMBOLYTIC ACTIVITY 

A polypeptide of the invention may also be involved in hemostasis or thrombolysis or
thrombosis. A polynucleotide of the invention can encode a polypeptide exhibiting such
attributes. Compositions may be useful in treatment of various coagulation disorders (including
hereditary disorders, such as hemophilias) or to enhance coagulation and other hemostatic events
in treating wounds resulting from trauma, surgery or other causes. A composition of the
invention may also be useful for dissolving or inhibiting formation of thromboses and for
treatment and prevention of conditions resulting therefrom (such as, for example, infarction of
cardiac and central nervous system vessels (e.g., stroke)).

Therapeutic compositions of the invention can be used in the following:

Assays for hemostatic and thrombolytic activity include, without limitation, those
described in: Linet et al., J. Clin. Pharmacol. 26:131-140, 1986; Burdick et al., Thrombosis Res.
45:413-419, 1987; Humphrey et al., Fibrinolysis 5:71-79 (1991); Schaub, Prostaglandins
35:467-474, 1988.

4.10.11 CANCER DIAGNOSIS AND THERAPY 

Polypeptides of the invention may be involved in cancer cell generation, proliferation or
metastasis. Detection of the presence or amount of polynucleotides or polypeptides of the
invention may be useful for the diagnosis and/or prognosis of one or more types of cancer. For
example, the presence or increased expression of a polynucleotide/polypeptide of the invention
may indicate a hereditary risk of cancer, a precancerous condition, or an ongoing malignancy.
Conversely, a defect in the gene or absence of the polypeptide may be associated with a cancer
condition. Identification of single nucleotide polymorphisms associated with cancer or a
predisposition to cancer may also be useful for diagnosis or prognosis.

Cancer treatments promote tumor regression by inhibiting tumor cell proliferation,
inhibiting angiogenesis (growth of new blood vessels that is necessary to support tumor growth)
and/or prohibiting metastasis by reducing tumor cell motility or invasiveness. Therapeutic
compositions of the invention may be effective in adult and pediatric oncology including in solid
phase tumors/malignancies, locally advanced tumors, human soft tissue sarcomas, metastatic
cancer, including lymphatic metastases, blood cell malignancies including multiple myeloma,
acute and chronic leukemias, and lymphomas, head and neck cancers including mouth cancer,
larynx cancer and thyroid cancer, lung cancers including small cell carcinoma and non-small cell
cancers, breast cancers including small cell carcinoma and ductal carcinoma, gastrointestinal
cancers including esophageal cancer, stomach cancer, colon cancer, colorectal cancer and polyps
associated with colorectal neoplasia, pancreatic cancers, liver cancer, urologic cancers including
bladder cancer and prostate cancer, malignancies of the female genital tract including ovarian
carcinoma, uterine (including endometrial) cancers, and solid tumor in the ovarian follicle,
kidney cancers including renal cell carcinoma, brain cancers including intrinsic brain tumors,
neuroblastoma, astrocytic brain tumors, gliomas, metastatic tumor cell invasion in the central
nervous system, bone cancers including osteomas, skin cancers including malignant melanoma,
tumor progression of human skin keratinocytes, squamous cell carcinoma, basal cell carcinoma,
hemangiopericytoma and Kaposi's sarcoma.

Polypeptides, polynucleotides, or modulators of polypeptides of the invention (including
inhibitors and stimulators of the biological activity of the polypeptide of the invention) may be
administered to treat cancer. Therapeutic compositions can be administered in therapeutically
effective dosages alone or in combination with adjuvant cancer therapy such as surgery,
chemotherapy, radiotherapy, thermotherapy, and laser therapy, and may provide a beneficial
effect, e.g. reducing tumor size, slowing rate of tumor growth, inhibiting metastasis, or otherwise
improving overall clinical condition, without necessarily eradicating the cancer.

The composition can also be administered in therapeutically effective amounts as a
portion of an anti-cancer cocktail. An anti-cancer cocktail is a mixture of the polypeptide or
modulator of the invention with one or more anti-cancer drugs in addition to a pharmaceutically
acceptable carrier for delivery. The use of anti-cancer cocktails as a cancer treatment is routine.
Anti-cancer drugs that are well known in the art and can be used as a treatment in combination
with the polypeptide or modulator of the invention include: Actinomycin D, Aminoglutethimide,
Asparaginase, Bleomycin, Busulfan, Carboplatin, Carmustine, Chlorambucil, Cisplatin (cis-DDP),
Cyclophosphamide, Cytarabine HCl (Cytosine arabinoside), Dacarbazine, Dactinomycin,
Daunorubicin HCl, Doxorubicin HCl, Estramustine phosphate sodium, Etoposide (VP16-213),
Floxuridine, 5-Fluorouracil (5-Fu), Flutamide, Hydroxyurea (hydroxycarbamide), Ifosfamide,
Interferon Alpha-2a, Interferon Alpha-2b, Leuprolide acetate (LHRH-releasing factor analog),
Lomustine, Mechlorethamine HCl (nitrogen mustard), Melphalan, Mercaptopurine, Mesna,
Methotrexate (MTX), Mitomycin, Mitoxantrone HCl, Octreotide, Plicamycin, Procarbazine HCl,
Streptozocin, Tamoxifen citrate, Thioguanine, Thiotepa, Vinblastine sulfate, Vincristine sulfate,
Amsacrine, Azacitidine, Hexamethylmelamine, Interleukin-2, Mitoguazone, Pentostatin,
Semustine, Teniposide, and Vindesine sulfate.

In addition, therapeutic compositions of the invention may be used for prophylactic 
treatment of cancer. There are hereditary conditions and/or environmental situations (e.g.,
exposure to carcinogens) known in the art that predispose an individual to developing cancers.
Under these circumstances, it may be beneficial to treat these individuals with therapeutically 
effective doses of the polypeptide of the invention to reduce the risk of developing cancers. 

In vitro models can be used to determine the effective doses of the polypeptide of the 
invention as a potential cancer treatment. These in vitro models include proliferation assays of
cultured tumor cells, growth of cultured tumor cells in soft agar (see Freshney, (1987) Culture of
Animal Cells: A Manual of Basic Technique, Wiley-Liss, New York, NY Ch 18 and Ch 21),
tumor systems in nude mice as described in Giovanella et al., J. Natl. Can. Inst., 52: 921-30
(1974), mobility and invasive potential of tumor cells in Boyden Chamber assays as described in
Pilkington et al., Anticancer Res., 17: 4107-9 (1997), and angiogenesis assays such as induction
of vascularization of the chick chorioallantoic membrane or induction of vascular endothelial
cell migration as described in Ribatta et al., Intl. J. Dev. Biol., 40: 1189-97 (1999) and Li et al.,
Clin. Exp. Metastasis, 17:423-9 (1999), respectively. Suitable tumor cell lines are available,
e.g., from American Type Tissue Culture Collection catalogs.
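
By way of illustration only, the readout of a proliferation assay of cultured tumor cells may be
reduced to a single potency figure. The following is a minimal sketch, assuming viability has
already been normalized to an untreated control; the function name, the dose units and the
sample values are hypothetical placeholders and are not data from this disclosure. It
interpolates, on a logarithmic dose axis, the dose at which proliferation falls to 50% of control.

```python
import math


def estimate_ic50(doses, viability):
    """Interpolate the dose giving 50% viability on a log10 dose axis.

    doses: ascending, positive concentrations (hypothetical units).
    viability: fraction of untreated-control proliferation at each dose.
    Returns None if the response never crosses 50%.
    """
    points = list(zip(doses, viability))
    for (d_lo, v_lo), (d_hi, v_hi) in zip(points, points[1:]):
        # Look for the first pair of adjacent doses that brackets 50%.
        if v_lo >= 0.5 >= v_hi and v_lo != v_hi:
            frac = (v_lo - 0.5) / (v_lo - v_hi)
            log_ic50 = math.log10(d_lo) + frac * (math.log10(d_hi) - math.log10(d_lo))
            return 10 ** log_ic50
    return None


# Placeholder values for illustration only (not experimental results).
doses = [0.01, 0.1, 1.0, 10.0, 100.0]
viability = [0.98, 0.90, 0.60, 0.40, 0.10]
ic50 = estimate_ic50(doses, viability)
```

Log-linear interpolation is used here only because it needs no curve-fitting library; a
four-parameter logistic fit would be a common alternative.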

4.10.12 RECEPTOR/LIGAND ACTIVITY

A polypeptide of the present invention may also demonstrate activity as receptor, 
receptor ligand or inhibitor or agonist of receptor/ligand interactions. A polynucleotide of the
invention can encode a polypeptide exhibiting such characteristics. Examples of such receptors 
and ligands include, without limitation, cytokine receptors and their ligands, receptor kinases and
their ligands, receptor phosphatases and their ligands, receptors involved in cell-cell interactions
and their ligands (including without limitation, cellular adhesion molecules such as selectins,
integrins and their ligands) and receptor/ligand pairs involved in antigen presentation, antigen
recognition and development of cellular and humoral immune responses. Receptors and ligands
are also useful for screening of potential peptide or small molecule inhibitors of the relevant
receptor/ligand interaction. A protein of the present invention (including, without limitation,
fragments of receptors and ligands) may itself be useful as an inhibitor of receptor/ligand
interactions.

The activity of a polypeptide of the invention may, among other means, be measured by 
the following methods: 

Suitable assays for receptor-ligand activity include without limitation those described in:
Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M.
Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 7.28,
Measurement of Cellular Adhesion under static conditions 7.28.1-7.28.22); Takai et al., Proc.
Natl. Acad. Sci. USA 84:6864-6868, 1987; Bierer et al., J. Exp. Med. 168:1145-1156, 1988;
Rosenstein et al., J. Exp. Med. 169:149-160, 1989; Stoltenborg et al., J. Immunol. Methods
175:59-68, 1994; Stitt et al., Cell 80:661-670, 1995.

By way of example, the polypeptides of the invention may be used as a receptor for a 
ligand(s) thereby transmitting the biological activity of that ligand(s). Ligands may be identified 
through binding assays, affinity chromatography, dihybrid screening assays, BIAcore assays, gel
overlay assays, or other methods known in the art.

Studies characterizing drugs or proteins as agonists, antagonists, partial agonists or
partial antagonists require the use of other proteins as competing ligands. The polypeptides of the
present invention or ligand(s) thereof may be labeled by being coupled to radioisotopes,
colorimetric molecules or toxin molecules by conventional methods ("Guide to Protein
Purification" Murray P. Deutscher (ed) Methods in Enzymology Vol. 182 (1990) Academic
Press, Inc. San Diego). Examples of radioisotopes include, but are not limited to, tritium and
carbon-14. Examples of colorimetric molecules include, but are not limited to, fluorescent
molecules such as fluorescamine, or rhodamine or other colorimetric molecules. Examples of
toxins include, but are not limited to, ricin.


4.10.13 DRUG SCREENING 

This invention is particularly useful for screening chemical compounds by using the 
novel polypeptides or binding fragments thereof in any of a variety of drug screening techniques. 
The polypeptides or fragments employed in such a test may either be free in solution, affixed to a
solid support, borne on a cell surface or located intracellularly. One method of drug screening
utilizes eukaryotic or prokaryotic host cells which are stably transformed with recombinant
nucleic acids expressing the polypeptide or a fragment thereof. Drugs are screened against such
transformed cells in competitive binding assays. Such cells, either in viable or fixed form, can
be used for standard binding assays. One may measure, for example, the formation of
complexes between polypeptides of the invention or fragments and the agent being tested or
examine the diminution in complex formation between the novel polypeptides and an
appropriate cell line; such cell lines are well known in the art.
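
As a non-limiting illustration of how a diminution in complex formation might be scored in a
competitive binding assay, the sketch below converts raw binding signals into percent
inhibition and flags candidate compounds. The compound names, the signal values and the 50%
threshold are hypothetical assumptions chosen for the example; they are not taken from this
disclosure.

```python
def percent_inhibition(signal, max_signal, background):
    """Percent reduction in complex formation relative to the no-compound control."""
    window = max_signal - background
    if window <= 0:
        raise ValueError("max_signal must exceed background")
    return 100.0 * (max_signal - signal) / window


def call_hits(readings, max_signal, background, threshold=50.0):
    """Return {compound: percent inhibition} for compounds at or above the threshold."""
    hits = {}
    for compound, signal in readings.items():
        inhibition = percent_inhibition(signal, max_signal, background)
        if inhibition >= threshold:
            hits[compound] = inhibition
    return hits


# Placeholder signals for illustration only (not experimental results).
readings = {"cmpd_A": 900.0, "cmpd_B": 300.0, "cmpd_C": 550.0}
hits = call_hits(readings, max_signal=1000.0, background=100.0)
```

Here max_signal is the binding signal without test compound and background is the
non-specific signal; in practice both would be taken from control wells run alongside the
test compounds.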

Sources for test compounds that may be screened for ability to bind to or modulate (i.e.,
increase or decrease) the activity of polypeptides of the invention include (1) inorganic and 
organic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries 
comprised of either random or mimetic peptides, oligonucleotides or organic molecules. 

Chemical libraries may be readily synthesized or purchased from a number of 
commercial sources, and may include structural analogs of known compounds or compounds 
that are identified as "hits" or "leads" via natural product screening. 

The sources of natural product libraries are microorganisms (including bacteria and 
fungi), animals, plants or other vegetation, or marine organisms, and libraries of mixtures for 
screening may be created by: (1) fermentation and extraction of broths from soil, plant or marine 
microorganisms or (2) extraction of the organisms themselves. Natural product libraries include 
polyketides, non-ribosomal peptides, and (non-naturally occurring) variants thereof. For a 
review, see Science 252:63-68 (1998). 

Combinatorial libraries are composed of large numbers of peptides, oligonucleotides or 
organic compounds and can be readily prepared by traditional automated synthesis methods, 
PCR, cloning or proprietary synthetic methods. Of particular interest are peptide and 
oligonucleotide combinatorial libraries. Still other libraries of interest include peptide, protein, 
peptidomimetic, multiparallel synthetic collection, recombinatorial, and polypeptide libraries. 
For a review of combinatorial chemistry and libraries created therefrom, see Myers, Curr. Opin. 
Biotechnol 8:701-707 (1997). For reviews and examples of peptidomimetic libraries, see 
Al-Obeidi et al., Mol Biotechnol 9(3):205-23 (1998); Hruby et al., Curr Opin Chem Biol,
1(1):114-19 (1997); Dorner et al., Bioorg Med Chem, 4(5):709-15 (1996) (alkylated dipeptides). 

Identification of modulators through use of the various libraries described herein permits 
modification of the candidate "hit" (or "lead") to optimize the capacity of the "hit" to bind a 
polypeptide of the invention. The molecules identified in the binding assay are then tested for 
antagonist or agonist activity in in vivo tissue culture or animal models that are well known in the 
art. In brief, the molecules are titrated into a plurality of cell cultures or animals and then tested 
for either cell/animal death or prolonged survival of the animal/cells.

The binding molecules thus identified may be complexed with toxins, e.g., ricin or
cholera, or with other compounds that are toxic to cells such as radioisotopes. The toxin-binding
molecule complex is then targeted to a tumor or other cell by the specificity of the binding
molecule for a polypeptide of the invention. Alternatively, the binding molecules may be
complexed with imaging agents for targeting and imaging purposes.

4.10.14 ASSAY FOR RECEPTOR ACTIVITY 

The invention also provides methods to detect specific binding of a polypeptide, e.g., a
ligand or a receptor. The art provides numerous assays particularly useful for identifying
previously unknown binding partners for receptor polypeptides of the invention. For example,
expression cloning using mammalian or bacterial cells, or dihybrid screening assays can be used
to identify polynucleotides encoding binding partners. As another example, affinity
chromatography with the appropriate immobilized polypeptide of the invention can be used to
isolate polypeptides that recognize and bind polypeptides of the invention. There are a number
of different libraries used for the identification of compounds, and in particular small molecules,
that modulate (i.e., increase or decrease) biological activity of a polypeptide of the invention.
Ligands for receptor polypeptides of the invention can also be identified by adding exogenous
ligands, or cocktails of ligands, to two cell populations that are genetically identical except for
the expression of the receptor of the invention: one cell population expresses the receptor of the
invention whereas the other does not. The responses of the two cell populations to the addition of
ligand(s) are then compared. Alternatively, an expression library can be co-expressed with the
polypeptide of the invention in cells and assayed for an autocrine response to identify potential
ligand(s). As still another example, BIAcore assays, gel overlay assays, or other methods known
in the art can be used to identify binding partner polypeptides, including, (1) organic and
inorganic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries
comprised of random peptides, oligonucleotides or organic molecules.
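
The comparison of two otherwise identical cell populations described above may be sketched
as follows. This is an illustrative outline only: the ligand names, the replicate readouts and
the two-fold cutoff are hypothetical assumptions, and any suitable response measure could be
substituted.

```python
from statistics import mean


def receptor_dependent_ligands(expressing, control, min_fold=2.0):
    """Return {ligand: fold change} for ligands whose mean response in
    receptor-expressing cells is at least min_fold times the mean response
    in the matched cells lacking the receptor."""
    candidates = {}
    for ligand, replicate_values in expressing.items():
        baseline = mean(control[ligand])
        if baseline <= 0:
            continue  # fold change is undefined without a positive baseline
        fold = mean(replicate_values) / baseline
        if fold >= min_fold:
            candidates[ligand] = fold
    return candidates


# Placeholder replicate readouts for illustration only (not experimental results).
expressing = {"ligand_1": [10.0, 12.0, 11.0], "ligand_2": [40.0, 44.0, 42.0]}
control = {"ligand_1": [10.0, 11.0, 12.0], "ligand_2": [10.0, 11.0, 12.0]}
candidates = receptor_dependent_ligands(expressing, control)
```

A ligand that elicits a response only in the receptor-expressing population is a candidate
ligand for the receptor; a ligand that elicits the same response in both populations is not.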

The role of downstream intracellular signaling molecules in the signaling cascade of the
polypeptide of the invention can be determined. For example, a chimeric protein in which the
cytoplasmic domain of the polypeptide of the invention is fused to the extracellular portion of a
protein, whose ligand has been identified, is produced in a host cell. The cell is then incubated
with the ligand specific for the extracellular portion of the chimeric protein, thereby activating
the chimeric receptor. Known downstream proteins involved in intracellular signaling can then
be assayed for expected modifications, i.e., phosphorylation. Other methods known to those in the
art can also be used to identify signaling molecules involved in receptor activity.

4.10.15 ANTI-INFLAMMATORY ACTIVITY

Compositions of the present invention may also exhibit anti-inflammatory activity. The
anti-inflammatory activity may be achieved by providing a stimulus to cells involved in the
inflammatory response, by inhibiting or promoting cell-cell interactions (such as, for example,
cell adhesion), by inhibiting or promoting chemotaxis of cells involved in the inflammatory
process, inhibiting or promoting cell extravasation, or by stimulating or suppressing production
of other factors which more directly inhibit or promote an inflammatory response. Compositions
with such activities can be used to treat inflammatory conditions (including chronic or acute
conditions), including without limitation inflammation associated with infection (such as septic
shock, sepsis or systemic inflammatory response syndrome (SIRS)), ischemia-reperfusion injury,
endotoxin lethality, arthritis, complement-mediated hyperacute rejection, nephritis, cytokine or
chemokine-induced lung injury, inflammatory bowel disease, Crohn's disease or resulting from
over production of cytokines such as TNF or IL-1. Compositions of the invention may also be
useful to treat anaphylaxis and hypersensitivity to an antigenic substance or material.

Compositions of this invention may be utilized to prevent or treat conditions such as, but not
limited to, sepsis, acute pancreatitis, endotoxin shock, cytokine induced shock, rheumatoid
arthritis, chronic inflammatory arthritis, pancreatic cell damage from diabetes mellitus type 1,
graft versus host disease, inflammatory bowel disease, inflammation associated with pulmonary
disease, other autoimmune disease or inflammatory disease, as an antiproliferative agent such as for
acute or chronic myelogenous leukemia, or in the prevention of premature labor secondary to
intrauterine infections.



4.10.16 LEUKEMIAS 

Leukemias and related disorders may be treated or prevented by administration of a
therapeutic that promotes or inhibits function of the polynucleotides and/or polypeptides of the
invention. Such leukemias and related disorders include but are not limited to acute leukemia,
acute lymphocytic leukemia, acute myelocytic leukemia, myeloblastic, promyelocytic,
myelomonocytic, monocytic, erythroleukemia, chronic leukemia, chronic myelocytic
(granulocytic) leukemia and chronic lymphocytic leukemia (for a review of such disorders, see
Fishman et al., 1985, Medicine, 2d Ed, J.B. Lippincott Co., Philadelphia).

4.10.17 NERVOUS SYSTEM DISORDERS 

Nervous system disorders, involving cell types which can be tested for efficacy of
intervention with compounds that modulate the activity of the polynucleotides and/or
polypeptides of the invention, and which can be treated upon thus observing an indication of
therapeutic utility, include but are not limited to nervous system injuries, and diseases or
disorders which result in either a disconnection of axons, a diminution or degeneration of
neurons, or demyelination. Nervous system lesions which may be treated in a patient (including
human and non-human mammalian patients) according to the invention include but are not
limited to the following lesions of either the central (including spinal cord, brain) or peripheral
nervous systems:

(i) traumatic lesions, including lesions caused by physical injury or associated with
surgery, for example, lesions which sever a portion of the nervous system, or compression
injuries;

(ii) ischemic lesions, in which a lack of oxygen in a portion of the nervous system
results in neuronal injury or death, including cerebral infarction or ischemia, or spinal cord
infarction or ischemia;

(iii) infectious lesions, in which a portion of the nervous system is destroyed or
injured as a result of infection, for example, by an abscess or associated with infection by human
immunodeficiency virus, herpes zoster, or herpes simplex virus or with Lyme disease,
tuberculosis, syphilis;

(iv) degenerative lesions, in which a portion of the nervous system is destroyed or
injured as a result of a degenerative process including but not limited to degeneration associated
with Parkinson's disease, Alzheimer's disease, Huntington's chorea, or amyotrophic lateral
sclerosis;

(v) lesions associated with nutritional diseases or disorders, in which a portion of the
nervous system is destroyed or injured by a nutritional disorder or disorder of metabolism
including but not limited to, vitamin B12 deficiency, folic acid deficiency, Wernicke disease,
tobacco-alcohol amblyopia, Marchiafava-Bignami disease (primary degeneration of the corpus
callosum), and alcoholic cerebellar degeneration;

(vi) neurological lesions associated with systemic diseases including but not limited to
diabetes (diabetic neuropathy, Bell's palsy), systemic lupus erythematosus, carcinoma, or
sarcoidosis;

(vii) lesions caused by toxic substances including alcohol, lead, or particular
neurotoxins; and

(viii) demyelinated lesions in which a portion of the nervous system is destroyed or
injured by a demyelinating disease including but not limited to multiple sclerosis, human
immunodeficiency virus-associated myelopathy, transverse myelopathy of various etiologies,
progressive multifocal leukoencephalopathy, and central pontine myelinolysis.

Therapeutics which are useful according to the invention for treatment of a nervous
system disorder may be selected by testing for biological activity in promoting the survival or
differentiation of neurons. For example, and not by way of limitation, therapeutics which elicit
any of the following effects may be useful according to the invention:

(i) increased survival time of neurons in culture;

(ii) increased sprouting of neurons in culture or in vivo;

(iii) increased production of a neuron-associated molecule in culture or in vivo, e.g.,
choline acetyltransferase or acetylcholinesterase with respect to motor neurons; or

(iv) decreased symptoms of neuron dysfunction in vivo.

Such effects may be measured by any method known in the art. In preferred,
non-limiting embodiments, increased survival of neurons may be measured by the method set
forth in Arakawa et al. (1990, J. Neurosci. 10:3507-3515); increased sprouting of neurons may
be detected by methods set forth in Pestronk et al. (1980, Exp. Neurol. 70:65-82) or Brown et al.
(1981, Ann. Rev. Neurosci. 4:17-42); increased production of neuron-associated molecules may
be measured by bioassay, enzymatic assay, antibody binding, Northern blot assay, etc.,
depending on the molecule to be measured; and motor neuron dysfunction may be measured by
assessing the physical manifestation of motor neuron disorder, e.g., weakness, motor neuron
conduction velocity, or functional disability.

In specific embodiments, motor neuron disorders that may be treated according to the
invention include but are not limited to disorders such as infarction, infection, exposure to toxin,
trauma, surgical damage, degenerative disease or malignancy that may affect motor neurons as
well as other components of the nervous system, as well as disorders that selectively affect
neurons such as amyotrophic lateral sclerosis, and including but not limited to progressive spinal
muscular atrophy, progressive bulbar palsy, primary lateral sclerosis, infantile and juvenile
muscular atrophy, progressive bulbar paralysis of childhood (Fazio-Londe syndrome),
poliomyelitis and the post polio syndrome, and Hereditary Motorsensory Neuropathy
(Charcot-Marie-Tooth Disease).

4.10.18 OTHER ACTIVITIES 

A polypeptide of the invention may also exhibit one or more of the following additional
activities or effects: inhibiting the growth, infection or function of, or killing, infectious agents,
including, without limitation, bacteria, viruses, fungi and other parasites; effecting (suppressing
or enhancing) bodily characteristics, including, without limitation, height, weight, hair color, eye
color, skin, fat to lean ratio or other tissue pigmentation, or organ or body part size or shape
(such as, for example, breast augmentation or diminution, change in bone form or shape);
effecting biorhythms or circadian cycles or rhythms; effecting the fertility of male or female
subjects; effecting the metabolism, catabolism, anabolism, processing, utilization, storage or
elimination of dietary fat, lipid, protein, carbohydrate, vitamins, minerals, co-factors or other
nutritional factors or component(s); effecting behavioral characteristics, including, without
limitation, appetite, libido, stress, cognition (including cognitive disorders), depression
(including depressive disorders) and violent behaviors; providing analgesic effects or other pain
reducing effects; promoting differentiation and growth of embryonic stem cells in lineages other
than hematopoietic lineages; hormonal or endocrine activity; in the case of enzymes, correcting
deficiencies of the enzyme and treating deficiency-related diseases; treatment of
hyperproliferative disorders (such as, for example, psoriasis); immunoglobulin-like activity (such
as, for example, the ability to bind antigens or complement); and the ability to act as an antigen
in a vaccine composition to raise an immune response against such protein or another material or
entity which is cross-reactive with such protein.

4.10.19 IDENTIFICATION OF POLYMORPHISMS

The demonstration of polymorphisms makes possible the identification of such 
polymorphisms in human subjects and the pharmacogenetic use of this information for diagnosis
and treatment. Such polymorphisms may be associated with, e.g., differential predisposition or
susceptibility to various disease states (such as disorders involving inflammation or immune
response) or a differential response to drug administration, and this genetic information can be
used to tailor preventive or therapeutic treatment appropriately. For example, the existence of a 
polymorphism associated with a predisposition to inflammation or autoimmune disease makes 
possible the diagnosis of this condition in humans by identifying the presence of the 
polymorphism. 

Polymorphisms can be identified in a variety of ways known in the art which all
generally involve obtaining a sample from a patient, analyzing DNA from the sample, optionally
involving isolation or amplification of the DNA, and identifying the presence of the
polymorphism in the DNA. For example, PCR may be used to amplify an appropriate fragment
of genomic DNA which may then be sequenced. Alternatively, the DNA may be subjected to
allele-specific oligonucleotide hybridization (in which appropriate oligonucleotides are
hybridized to the DNA under conditions permitting detection of a single base mismatch) or to a
single nucleotide extension assay (in which an oligonucleotide that hybridizes immediately
adjacent to the position of the polymorphism is extended with one or more labeled nucleotides).
In addition, traditional restriction fragment length polymorphism analysis (using restriction
enzymes that provide differential digestion of the genomic DNA depending on the presence or
absence of the polymorphism) may be performed. Arrays with nucleotide sequences of the
present invention can be used to detect polymorphisms. The array can comprise modified
nucleotide sequences of the present invention in order to detect the nucleotide sequences of the
present invention. In the alternative, any one of the nucleotide sequences of the present
invention can be placed on the array to detect changes from those sequences.

Alternatively a polymorphism resulting in a change in the amino acid sequence could 
also be detected by detecting a corresponding change in amino acid sequence of the protein, e.g., 
by an antibody specific to the variant sequence. 
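
Once a fragment has been amplified and sequenced, the comparison against a reference
sequence is straightforward to automate. The sketch below is illustrative only: both
sequences are short made-up strings, not sequences of the invention, and the two sequences
are assumed to be already aligned to equal length. It lists single-base differences and also
checks whether a difference creates or destroys a restriction site (GAATTC, the EcoRI
recognition sequence, is used as the example), which is the situation in which restriction
fragment length polymorphism analysis would be informative.

```python
def find_snps(reference, sample):
    """List (1-based position, reference base, sample base) for each mismatch."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to equal length")
    return [
        (i + 1, ref_base, sample_base)
        for i, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]


def site_changed(reference, sample, site="GAATTC"):
    """True if the number of occurrences of a restriction site differs."""
    return reference.count(site) != sample.count(site)


# Made-up sequences for illustration only.
reference = "ATGGAATTCCGTA"
sample = "ATGGAGTTCCGTA"
snps = find_snps(reference, sample)
rflp_informative = site_changed(reference, sample)
```

In this example the single A-to-G change at position 6 removes the only GAATTC site, so
digestion with the corresponding enzyme would yield different fragment patterns for the
two alleles.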

4.10.20 ARTHRITIS AND INFLAMMATION

The immunosuppressive effects of the compositions of the invention against rheumatoid
arthritis are determined in an experimental animal model system. The experimental model system
is adjuvant induced arthritis in rats, and the protocol is described by J. Holoshitz, et al., 1983,
Science, 219:56, or by B. Waksman et al., 1963, Int. Arch. Allergy Appl. Immunol., 23:129.

Induction of the disease can be caused by a single injection, generally intradermally, of a
suspension of killed Mycobacterium tuberculosis in complete Freund's adjuvant (CFA). The
route of injection can vary, but rats may be injected at the base of the tail with an adjuvant
mixture. The polypeptide is administered in phosphate buffered solution (PBS) at a dose of about
1-5 mg/kg. The control consists of administering PBS only.

The procedure for testing the effects of the test compound would consist of intradermally
injecting killed Mycobacterium tuberculosis in CFA followed by immediately administering the
test compound and subsequent treatment every other day until day 24. At 14, 15, 18, 20, 22, and
24 days after injection of Mycobacterium CFA, an overall arthritis score may be obtained as
described by J. Holoshitz above. An analysis of the data would reveal that the test compound
would have a dramatic effect on the swelling of the joints as measured by a decrease of the
arthritis score.
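
The analysis of arthritis scores may be organized as in the following sketch, which averages
the scores of each group on each scoring day and reports the difference between the PBS
control and the treated group. The scoring days are those stated above; the animal
identifiers and score values are hypothetical placeholders and do not represent results
obtained with any composition of the invention.

```python
from statistics import mean

SCORING_DAYS = [14, 15, 18, 20, 22, 24]


def mean_scores_by_day(scores):
    """scores: {animal_id: {day: score}} -> {day: mean score across animals}."""
    return {day: mean(animal[day] for animal in scores.values()) for day in SCORING_DAYS}


def score_reduction(treated, control):
    """Per-day difference: control group mean minus treated group mean."""
    treated_means = mean_scores_by_day(treated)
    control_means = mean_scores_by_day(control)
    return {day: control_means[day] - treated_means[day] for day in SCORING_DAYS}


# Placeholder scores for illustration only (not experimental results).
control = {
    "rat_1": {day: 8 for day in SCORING_DAYS},
    "rat_2": {day: 10 for day in SCORING_DAYS},
}
treated = {
    "rat_3": {day: 4 for day in SCORING_DAYS},
    "rat_4": {day: 6 for day in SCORING_DAYS},
}
reduction = score_reduction(treated, control)
```

A positive value on a given day indicates a lower mean arthritis score in the treated group
than in the control group on that day.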

4.11 THERAPEUTIC METHODS 

The compositions (including polypeptide fragments, analogs, variants and antibodies or
other binding partners or modulators including antisense polynucleotides) of the invention have
numerous applications in a variety of therapeutic methods. Examples of therapeutic applications 
include, but are not limited to, those exemplified herein. 

4.11.1 EXAMPLE

One embodiment of the invention is the administration of an effective amount of the
polypeptides or other composition of the invention to individuals affected by a disease or
disorder that can be modulated by regulating the peptides of the invention. While the mode of
administration is not particularly important, parenteral administration is preferred. An
exemplary mode of administration is to deliver an intravenous bolus. The dosage of the
polypeptides or other composition of the invention will normally be determined by the
prescribing physician. It is to be expected that the dosage will vary according to the age, weight,
condition and response of the individual patient. Typically, the amount of polypeptide
administered per dose will be in the range of about 0.01 ng/kg to 100 mg/kg of body weight, with
the preferred dose being about 0.1 ug/kg to 10 mg/kg of patient body weight. For parenteral
administration, polypeptides of the invention will be formulated in an injectable form combined
with a pharmaceutically acceptable parenteral vehicle. Such vehicles are well known in the art
and examples include water, saline, Ringer's solution, dextrose solution, and solutions consisting
of small amounts of human serum albumin. The vehicle may contain minor amounts of
additives that maintain the isotonicity and stability of the polypeptide or other active ingredient.
The preparation of such solutions is within the skill of the art.

4.12 PHARMACEUTICAL FORMULATIONS AND ROUTES OF 
ADMINISTRATION 

A protein or other composition of the present invention (from whatever source derived,
including without limitation from recombinant and non-recombinant sources and including
antibodies and other binding partners of the polypeptides of the invention) may be administered
to a patient in need, by itself, or in pharmaceutical compositions where it is mixed with suitable
carriers or excipient(s) at doses to treat or ameliorate a variety of disorders. Such a composition
may optionally contain (in addition to protein or other active ingredient and a carrier) diluents,
fillers, salts, buffers, stabilizers, solubilizers, and other materials well known in the art. The term
"pharmaceutically acceptable" means a non-toxic material that does not interfere with the
effectiveness of the biological activity of the active ingredient(s). The characteristics of the
carrier will depend on the route of administration. The pharmaceutical composition of the
invention may also contain cytokines, lymphokines, or other hematopoietic factors such as
M-CSF, GM-CSF, TNF, IL-1, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12,
IL-13, IL-14, IL-15, IFN, TNF0, TNF1, TNF2, G-CSF, Meg-CSF, thrombopoietin, stem cell
factor, and erythropoietin. In further compositions, proteins of the invention may be combined
with other agents beneficial to the treatment of the disease or disorder in question. These agents
include various growth factors such as epidermal growth factor (EGF), platelet-derived growth
factor (PDGF), transforming growth factors (TGF-α and TGF-β), insulin-like growth factor
(IGF), as well as cytokines described herein.

The pharmaceutical composition may further contain other agents which either enhance
the activity of the protein or other active ingredient or complement its activity or use in
treatment. Such additional factors and/or agents may be included in the pharmaceutical
composition to produce a synergistic effect with protein or other active ingredient of the
invention, or to minimize side effects. Conversely, protein or other active ingredient of the
present invention may be included in formulations of the particular clotting factor, cytokine,
lymphokine, other hematopoietic factor, thrombolytic or anti-thrombotic factor, or
anti-inflammatory agent to minimize side effects of the clotting factor, cytokine, lymphokine, other
hematopoietic factor, thrombolytic or anti-thrombotic factor, or anti-inflammatory agent (such as
IL-1Ra, IL-1 Hy1, IL-1 Hy2, anti-TNF, corticosteroids, immunosuppressive agents). A protein
of the present invention may be active in multimers (e.g., heterodimers or homodimers) or
complexes with itself or other proteins. As a result, pharmaceutical compositions of the
invention may comprise a protein of the invention in such multimeric or complexed form.

As an alternative to being included in a pharmaceutical composition of the invention
including a first protein, a second protein or a therapeutic agent may be concurrently
administered with the first protein (e.g., at the same time, or at differing times provided that
therapeutic concentrations of the combination of agents are achieved at the treatment site).

Techniques for formulation and administration of the compounds of the instant application may
be found in "Remington's Pharmaceutical Sciences," Mack Publishing Co., Easton, PA, latest
edition. A therapeutically effective dose further refers to that amount of the compound sufficient
to result in amelioration of symptoms, e.g., treatment, healing, prevention or amelioration of the
relevant medical condition, or an increase in rate of treatment, healing, prevention or
amelioration of such conditions. When applied to an individual active ingredient, administered
alone, a therapeutically effective dose refers to that ingredient alone. When applied to a
combination, a therapeutically effective dose refers to combined amounts of the active
ingredients that result in the therapeutic effect, whether administered in combination, serially or
simultaneously.

In practicing the method of treatment or use of the present invention, a therapeutically 
effective amount of protein or other active ingredient of the present invention is administered to 
a mammal having a condition to be treated. Protein or other active ingredient of the present 
invention may be administered in accordance with the method of the invention either alone or in 
combination with other therapies such as treatments employing cytokines, lymphokines or other 
hematopoietic factors. When co-administered with one or more cytokines, lymphokines or other 
hematopoietic factors, protein or other active ingredient of the present invention may be 
administered either simultaneously with the cytokine(s), lymphokine(s), other hematopoietic 
factor(s), thrombolytic or anti-thrombotic factors, or sequentially. If administered sequentially, 
the attending physician will decide on the appropriate sequence of administering protein or other 
active ingredient of the present invention in combination with cytokine(s), lymphokine(s), other 
hematopoietic factor(s), thrombolytic or anti-thrombotic factors. 

4.12.1 ROUTES OF ADMINISTRATION 

Suitable routes of administration may, for example, include oral, rectal, transmucosal, or 
intestinal administration; parenteral delivery, including intramuscular, subcutaneous, 
intramedullary injections, as well as intrathecal, direct intraventricular, intravenous, 
intraperitoneal, intranasal, or intraocular injections. Administration of protein or other active 
ingredient of the present invention used in the pharmaceutical composition or to practice the 
method of the present invention can be carried out in a variety of conventional ways, such as oral 
ingestion, inhalation, topical application or cutaneous, subcutaneous, intraperitoneal, parenteral 
or intravenous injection. Intravenous administration to the patient is preferred. 

Alternatively, one may administer the compound in a local rather than systemic manner, for 
example, via injection of the compound directly into arthritic joints or fibrotic tissue, often 
in a depot or sustained release formulation. In order to prevent the scarring process frequently 
occurring as a complication of glaucoma surgery, the compounds may be administered topically, 
for example, as eye drops. Furthermore, one may administer the drug in a targeted drug delivery 
system, for example, in a liposome coated with a specific antibody, targeting, for example, 
arthritic or fibrotic tissue. The liposomes will be targeted to and taken up selectively by the 
afflicted tissue. 

The polypeptides of the invention are administered by any route that delivers an effective 
dosage to the desired site of action. The determination of a suitable route of administration and 
an effective dosage for a particular indication is within the level of skill in the art. Preferably for 
wound treatment, one administers the therapeutic compound directly to the site. Suitable dosage 
ranges for the polypeptides of the invention can be extrapolated from these dosages or from 
similar studies in appropriate animal models. Dosages can then be adjusted as necessary by the 
clinician to provide maximal therapeutic benefit. 

4.12.2 COMPOSITIONS/FORMULATIONS 

Pharmaceutical compositions for use in accordance with the present invention thus may 
be formulated in a conventional manner using one or more physiologically acceptable carriers 
comprising excipients and auxiliaries which facilitate processing of the active compounds into 
preparations which can be used pharmaceutically. These pharmaceutical compositions may be 
manufactured in a manner that is itself known, e.g., by means of conventional mixing, 
dissolving, granulating, dragee-making, levigating, emulsifying, encapsulating, entrapping or 
lyophilizing processes. Proper formulation is dependent upon the route of administration 
chosen. When a therapeutically effective amount of protein or other active ingredient of the 
present invention is administered orally, protein or other active ingredient of the present 
invention will be in the form of a tablet, capsule, powder, solution or elixir. When administered 
in tablet form, the pharmaceutical composition of the invention may additionally contain a solid 
carrier such as a gelatin or an adjuvant. The tablet, capsule, and powder contain from about 5 to 
95% protein or other active ingredient of the present invention, and preferably from about 25 to 
90% protein or other active ingredient of the present invention. When administered in liquid 
form, a liquid carrier such as water, petroleum, oils of animal or plant origin such as peanut oil, 
mineral oil, soybean oil, or sesame oil, or synthetic oils may be added. The liquid form of the 
pharmaceutical composition may further contain physiological saline solution, dextrose or other 
saccharide solution, or glycols such as ethylene glycol, propylene glycol or polyethylene glycol. 
When administered in liquid form, the pharmaceutical composition contains from about 0.5 to 
90% by weight of protein or other active ingredient of the present invention, and preferably from 
about 1 to 50% protein or other active ingredient of the present invention. 

When a therapeutically effective amount of protein or other active ingredient of the 
present invention is administered by intravenous, cutaneous or subcutaneous injection, protein or 
other active ingredient of the present invention will be in the form of a pyrogen-free, parenterally 
acceptable aqueous solution. The preparation of such parenterally acceptable protein or other 
active ingredient solutions, having due regard to pH, isotonicity, stability, and the like, is within 
the skill in the art. A preferred pharmaceutical composition for intravenous, cutaneous, or 
subcutaneous injection should contain, in addition to protein or other active ingredient of the 
present invention, an isotonic vehicle such as Sodium Chloride Injection, Ringer's Injection, 
Dextrose Injection, Dextrose and Sodium Chloride Injection, Lactated Ringer's Injection, or 
other vehicle as known in the art. The pharmaceutical composition of the present invention may 
also contain stabilizers, preservatives, buffers, antioxidants, or other additives known to those of 
skill in the art. For injection, the agents of the invention may be formulated in aqueous 
solutions, preferably in physiologically compatible buffers such as Hanks' solution, Ringer's 
solution, or physiological saline buffer. For transmucosal administration, penetrants appropriate 
to the barrier to be permeated are used in the formulation. Such penetrants are generally known 
in the art. 

For oral administration, the compounds can be formulated readily by combining the 
active compounds with pharmaceutically acceptable carriers well known in the art. Such carriers 
enable the compounds of the invention to be formulated as tablets, pills, dragees, capsules, 
liquids, gels, syrups, slurries, suspensions and the like, for oral ingestion by a patient to be 
treated. Pharmaceutical preparations for oral use can be obtained from a solid excipient, 
optionally grinding a resulting mixture, and processing the mixture of granules, after adding 
suitable auxiliaries, if desired, to obtain tablets or dragee cores. Suitable excipients are, in 
particular, fillers such as sugars, including lactose, sucrose, mannitol, or sorbitol; cellulose 
preparations such as, for example, maize starch, wheat starch, rice starch, potato starch, gelatin, 
gum tragacanth, methyl cellulose, hydroxypropylmethyl-cellulose, sodium 
carboxymethylcellulose, and/or polyvinylpyrrolidone (PVP). If desired, disintegrating agents 
may be added, such as the cross-linked polyvinyl pyrrolidone, agar, or alginic acid or a salt 
thereof such as sodium alginate. Dragee cores are provided with suitable coatings. For this 
purpose, concentrated sugar solutions may be used, which may optionally contain gum arabic, 
talc, polyvinyl pyrrolidone, carbopol gel, polyethylene glycol, and/or titanium dioxide, lacquer 
solutions, and suitable organic solvents or solvent mixtures. Dyestuffs or pigments may be 
added to the tablets or dragee coatings for identification or to characterize different combinations 
of active compound doses. 

Pharmaceutical preparations which can be used orally include push-fit capsules made of 
gelatin, as well as soft, sealed capsules made of gelatin and a plasticizer, such as glycerol or 
sorbitol. The push-fit capsules can contain the active ingredients in admixture with filler such as 
lactose, binders such as starches, and/or lubricants such as talc or magnesium stearate and, 
optionally, stabilizers. In soft capsules, the active compounds may be dissolved or suspended in 
suitable liquids, such as fatty oils, liquid paraffin, or liquid polyethylene glycols. In addition, 
stabilizers may be added. All formulations for oral administration should be in dosages suitable 
for such administration. For buccal administration, the compositions may take the form of 
tablets or lozenges formulated in conventional manner. 

For administration by inhalation, the compounds for use according to the present 
invention are conveniently delivered in the form of an aerosol spray presentation from 
pressurized packs or a nebuliser, with the use of a suitable propellant, e.g., 
dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide or 
other suitable gas. In the case of a pressurized aerosol the dosage unit may be determined by 
providing a valve to deliver a metered amount. Capsules and cartridges of, e.g., gelatin for use 
in an inhaler or insufflator may be formulated containing a powder mix of the compound and a 
suitable powder base such as lactose or starch. The compounds may be formulated for parenteral 
administration by injection, e.g., by bolus injection or continuous infusion. Formulations for 
injection may be presented in unit dosage form, e.g., in ampules or in multi-dose containers, with 
an added preservative. The compositions may take such forms as suspensions, solutions or 
emulsions in oily or aqueous vehicles, and may contain formulatory agents such as suspending, 
stabilizing and/or dispersing agents. 

Pharmaceutical formulations for parenteral administration include aqueous solutions of 
the active compounds in water-soluble form. Additionally, suspensions of the active compounds 
may be prepared as appropriate oily injection suspensions. Suitable lipophilic solvents or 
vehicles include fatty oils such as sesame oil, or synthetic fatty acid esters, such as ethyl oleate or 
triglycerides, or liposomes. Aqueous injection suspensions may contain substances which 
increase the viscosity of the suspension, such as sodium carboxymethyl cellulose, sorbitol, or 
dextran. Optionally, the suspension may also contain suitable stabilizers or agents which 
increase the solubility of the compounds to allow for the preparation of highly concentrated 
solutions. Alternatively, the active ingredient may be in powder form for constitution with a 
suitable vehicle, e.g., sterile pyrogen-free water, before use. 

The compounds may also be formulated in rectal compositions such as suppositories or 
retention enemas, e.g., containing conventional suppository bases such as cocoa butter or other 
glycerides. In addition to the formulations described previously, the compounds may also be 
formulated as a depot preparation. Such long acting formulations may be administered by 
implantation (for example subcutaneously or intramuscularly) or by intramuscular injection. 
Thus, for example, the compounds may be formulated with suitable polymeric or hydrophobic 
materials (for example as an emulsion in an acceptable oil) or ion exchange resins, or as 
sparingly soluble derivatives, for example, as a sparingly soluble salt. 

A pharmaceutical carrier for the hydrophobic compounds of the invention is a co-solvent 
system comprising benzyl alcohol, a nonpolar surfactant, a water-miscible organic polymer, and 
an aqueous phase. The co-solvent system may be the VPD co-solvent system. VPD is a solution 
of 3% w/v benzyl alcohol, 8% w/v of the nonpolar surfactant polysorbate 80, and 65% w/v 
polyethylene glycol 300, made up to volume in absolute ethanol. The VPD co-solvent system 
(VPD:5W) consists of VPD diluted 1:1 with a 5% dextrose in water solution. This co-solvent 
system dissolves hydrophobic compounds well, and itself produces low toxicity upon systemic 
administration. Naturally, the proportions of a co-solvent system may be varied considerably 
without destroying its solubility and toxicity characteristics. Furthermore, the identity of the 
co-solvent components may be varied: for example, other low-toxicity nonpolar surfactants may 
be used instead of polysorbate 80; the fraction size of polyethylene glycol may be varied; other 
biocompatible polymers may replace polyethylene glycol, e.g. polyvinyl pyrrolidone; and other 
sugars or polysaccharides may substitute for dextrose. Alternatively, other delivery systems for 
hydrophobic pharmaceutical compounds may be employed. Liposomes and emulsions are well 
known examples of delivery vehicles or carriers for hydrophobic drugs. Certain organic solvents 
such as dimethyl sulfoxide also may be employed, although usually at the cost of greater toxicity. 
Additionally, the compounds may be delivered using a sustained-release system, such as 
semipermeable matrices of solid hydrophobic polymers containing the therapeutic agent. 
Various types of sustained-release materials have been established and are well known by those 
skilled in the art. Sustained-release capsules may, depending on their chemical nature, release the 
compounds for a few weeks up to over 100 days. Depending on the chemical nature and the 
biological stability of the therapeutic reagent, additional strategies for protein or other active 
ingredient stabilization may be employed. 
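
As an illustration of the VPD:5W dilution described in this paragraph, the following minimal sketch computes the nominal concentration of each component after VPD is diluted 1:1 with 5% dextrose in water. It assumes that "1:1" means equal volumes and that volumes are additive; the names `VPD_STOCK`, `DEXTROSE_DILUENT` and `dilute_one_to_one` are hypothetical and are not part of the specification.

```python
# Composition of the VPD stock as stated in the text (% w/v).
VPD_STOCK = {
    "benzyl alcohol": 3.0,
    "polysorbate 80": 8.0,
    "polyethylene glycol 300": 65.0,
}

# The diluent is a 5% dextrose in water solution.
DEXTROSE_DILUENT = {"dextrose": 5.0}


def dilute_one_to_one(stock, diluent):
    """Return % w/v of each component after mixing equal volumes.

    Assumption: volumes are additive, so every concentration is halved.
    """
    mixed = {name: pct / 2.0 for name, pct in stock.items()}
    for name, pct in diluent.items():
        mixed[name] = mixed.get(name, 0.0) + pct / 2.0
    return mixed


vpd_5w = dilute_one_to_one(VPD_STOCK, DEXTROSE_DILUENT)
for component, pct in vpd_5w.items():
    print(f"{component}: {pct}% w/v")
```

Under these assumptions the diluted system nominally contains 1.5% benzyl alcohol, 4% polysorbate 80, 32.5% polyethylene glycol 300 and 2.5% dextrose.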

The pharmaceutical compositions also may comprise suitable solid or gel phase carriers 
or excipients. Examples of such carriers or excipients include but are not limited to calcium 
carbonate, calcium phosphate, various sugars, starches, cellulose derivatives, gelatin, and 
polymers such as polyethylene glycols. Many of the active ingredients of the invention may be 
provided as salts with pharmaceutically compatible counter ions. Such pharmaceutically 
acceptable base addition salts are those salts which retain the biological effectiveness and 
properties of the free acids and which are obtained by reaction with inorganic or organic bases 
such as sodium hydroxide, magnesium hydroxide, ammonia, trialkylamine, dialkylamine, 
monoalkylamine, dibasic amino acids, sodium acetate, potassium benzoate, triethanolamine and 
the like. 

The pharmaceutical composition of the invention may be in the form of a complex of the 
protein(s) or other active ingredient(s) of the present invention along with protein or peptide 
antigens. The protein and/or peptide antigen will deliver a stimulatory signal to both B and T 
lymphocytes. B lymphocytes will respond to antigen through their surface immunoglobulin 
receptor. T lymphocytes will respond to antigen through the T cell receptor (TCR) following 
presentation of the antigen by MHC proteins. MHC and structurally related proteins including 
those encoded by class I and class II MHC genes on host cells will serve to present the peptide 
antigen(s) to T lymphocytes. The antigen components could also be supplied as purified 
MHC-peptide complexes alone or with co-stimulatory molecules that can directly signal T cells. 
Alternatively, antibodies able to bind surface immunoglobulin and other molecules on B cells as 
well as antibodies able to bind the TCR and other molecules on T cells can be combined with the 
pharmaceutical composition of the invention. 

The pharmaceutical composition of the invention may be in the form of a liposome in 
which protein of the present invention is combined, in addition to other pharmaceutically 
acceptable carriers, with amphipathic agents such as lipids which exist in aggregated form as 
micelles, insoluble monolayers, liquid crystals, or lamellar layers in aqueous solution. Suitable 
lipids for liposomal formulation include, without limitation, monoglycerides, diglycerides, 
sulfatides, lysolecithins, phospholipids, saponin, bile acids, and the like. Preparation of such 
liposomal formulations is within the level of skill in the art, as disclosed, for example, in U.S. 
Patent Nos. 4,235,871; 4,501,728; 4,837,028; and 4,737,323, all of which are incorporated 
herein by reference. 

The amount of protein or other active ingredient of the present invention in the 
pharmaceutical composition of the present invention will depend upon the nature and severity of 
the condition being treated, and on the nature of prior treatments which the patient has 
undergone. Ultimately, the attending physician will decide the amount of protein or other active 
ingredient of the present invention with which to treat each individual patient. Initially, the 
attending physician will administer low doses of protein or other active ingredient of the present 
invention and observe the patient's response. Larger doses of protein or other active ingredient 
of the present invention may be administered until the optimal therapeutic effect is obtained for 
the patient, and at that point the dosage is not increased further. It is contemplated that the 
various pharmaceutical compositions used to practice the method of the present invention should 
contain about 0.01 µg to about 100 mg (preferably about 0.1 µg to about 10 mg, more preferably 
about 0.1 µg to about 1 mg) of protein or other active ingredient of the present invention per kg 
body weight. For compositions of the present invention which are useful for bone, cartilage, 
tendon or ligament regeneration, the therapeutic method includes administering the composition 
topically, systemically, or locally as an implant or device. When administered, the therapeutic 
composition for use in this invention is, of course, in a pyrogen-free, physiologically acceptable 
form. Further, the composition may desirably be encapsulated or injected in a viscous form for 
delivery to the site of bone, cartilage or tissue damage. Topical administration may be suitable 
for wound healing and tissue repair. Therapeutically useful agents other than a protein or other 
active ingredient of the invention, which may also optionally be included in the composition as 
described above, may alternatively or additionally be administered simultaneously or 
sequentially with the composition in the methods of the invention. Preferably for bone and/or 
cartilage formation, the composition would include a matrix capable of delivering the 
protein-containing or other active ingredient-containing composition to the site of bone and/or 
cartilage damage, providing a structure for the developing bone and cartilage and optimally 
capable of being resorbed into the body. Such matrices may be formed of materials presently in 
use for other implanted medical applications. 

The choice of matrix material is based on biocompatibility, biodegradability, mechanical 
properties, cosmetic appearance and interface properties. The particular application of the 
compositions will define the appropriate formulation. Potential matrices for the compositions 
may be biodegradable and chemically defined calcium sulfate, tricalcium phosphate, 
hydroxyapatite, polylactic acid, polyglycolic acid and polyanhydrides. Other potential materials 
are biodegradable and biologically well-defined, such as bone or dermal collagen. Further 
matrices are comprised of pure proteins or extracellular matrix components. Other potential 
matrices are nonbiodegradable and chemically defined, such as sintered hydroxyapatite, bioglass, 
aluminates, or other ceramics. Matrices may be comprised of combinations of any of the above 
mentioned types of material, such as polylactic acid and hydroxyapatite or collagen and 
tricalcium phosphate. The bioceramics may be altered in composition, such as in 
calcium-aluminate-phosphate and processing to alter pore size, particle size, particle shape, and 
biodegradability. Presently preferred is a 50:50 (mole weight) copolymer of lactic acid and 
glycolic acid in the form of porous particles having diameters ranging from 150 to 800 microns. 
In some applications, it will be useful to utilize a sequestering agent, such as carboxymethyl 
cellulose or autologous blood clot, to prevent the protein compositions from disassociating from 
the matrix. 

A preferred family of sequestering agents is cellulosic materials such as alkylcelluloses 
(including hydroxyalkylcelluloses), including methylcellulose, ethylcellulose, 
hydroxyethylcellulose, hydroxypropylcellulose, hydroxypropyl-methylcellulose, and 
carboxymethylcellulose, the most preferred being cationic salts of carboxymethylcellulose 
(CMC). Other preferred sequestering agents include hyaluronic acid, sodium alginate, 
poly(ethylene glycol), polyoxyethylene oxide, carboxyvinyl polymer and poly(vinyl alcohol). 
The amount of sequestering agent useful herein is 0.5-20 wt %, preferably 1-10 wt % based on 
total formulation weight, which represents the amount necessary to prevent desorption of the 
protein from the polymer matrix and to provide appropriate handling of the composition, yet not 
so much that the progenitor cells are prevented from infiltrating the matrix, thereby providing the 
protein the opportunity to assist the osteogenic activity of the progenitor cells. In further 
compositions, proteins or other active ingredients of the invention may be combined with other 
agents beneficial to the treatment of the bone and/or cartilage defect, wound, or tissue in 
question. These agents include various growth factors such as epidermal growth factor (EGF), 
platelet derived growth factor (PDGF), transforming growth factors (TGF-α and TGF-β), and 
insulin-like growth factor (IGF). 

The therapeutic compositions are also presently valuable for veterinary applications. 
Particularly domestic animals and thoroughbred horses, in addition to humans, are desired 
patients for such treatment with proteins or other active ingredients of the present invention. The 
dosage regimen of a protein-containing pharmaceutical composition to be used in tissue 
regeneration will be determined by the attending physician considering various factors which 
modify the action of the proteins, e.g., amount of tissue weight desired to be formed, the site of 
damage, the condition of the damaged tissue, the size of a wound, type of damaged tissue (e.g., 
bone), the patient's age, sex, and diet, the severity of any infection, time of administration and 
other clinical factors. The dosage may vary with the type of matrix used in the reconstitution 
and with inclusion of other proteins in the pharmaceutical composition. For example, the 
addition of other known growth factors, such as IGF I (insulin like growth factor I), to the final 
composition, may also affect the dosage. Progress can be monitored by periodic assessment of 
tissue/bone growth and/or repair, for example, X-rays, histomorphometric determinations and 
tetracycline labeling. 

Polynucleotides of the present invention can also be used for gene therapy. Such 
polynucleotides can be introduced either in vivo or ex vivo into cells for expression in a 
mammalian subject. Polynucleotides of the invention may also be administered by other known 
methods for introduction of nucleic acid into a cell or organism (including, without limitation, in 
the form of viral vectors or naked DNA). Cells may also be cultured ex vivo in the presence of 
proteins of the present invention in order to proliferate or to produce a desired effect on or 
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes. 

4.12.3 EFFECTIVE DOSAGE 

Pharmaceutical compositions suitable for use in the present invention include 
compositions wherein the active ingredients are contained in an effective amount to achieve their 
intended purpose. More specifically, a therapeutically effective amount means an amount 
effective to prevent development of or to alleviate the existing symptoms of the subject being 
treated. Determination of the effective amount is well within the capability of those skilled in 
the art, especially in light of the detailed disclosure provided herein. For any compound used in 
the method of the invention, the therapeutically effective dose can be estimated initially from 
appropriate in vitro assays. For example, a dose can be formulated in animal models to achieve a 
circulating concentration range that includes the IC50 as determined in cell culture (i.e., the 
concentration of the test compound which achieves a half-maximal inhibition of the protein's 
biological activity). Such information can be used to more accurately determine useful doses in 
humans. 

A therapeutically effective dose refers to that amount of the compound that results in 
amelioration of symptoms or a prolongation of survival in a patient. Toxicity and therapeutic 
efficacy of such compounds can be determined by standard pharmaceutical procedures in cell 
cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the 
population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose 
ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the 
ratio between LD50 and ED50. Compounds which exhibit high therapeutic indices are preferred. 
The data obtained from these cell culture assays and animal studies can be used in formulating a 
range of dosage for use in humans. The dosage of such compounds lies preferably within a range 
of circulating concentrations that include the ED50 with little or no toxicity. The dosage may 
vary within this range depending upon the dosage form employed and the route of administration 
utilized. The exact formulation, route of administration and dosage can be chosen by the 
individual physician in view of the patient's condition. See, e.g., Fingl et al., 1975, in "The 
Pharmacological Basis of Therapeutics", Ch. 1 p. 1. Dosage amount and interval may be adjusted 
individually to provide plasma levels of the active moiety which are sufficient to maintain the 
desired effects, or minimal effective concentration (MEC). The MEC will vary for each 
compound but can be estimated from in vitro data. Dosages necessary to achieve the MEC will 
depend on individual characteristics and route of administration. However, HPLC assays or 
bioassays can be used to determine plasma concentrations. 
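
The therapeutic index described above is a simple ratio, which the following minimal sketch makes concrete. The function name `therapeutic_index` and the numeric values are hypothetical and serve only as an illustration; LD50 and ED50 must be expressed in the same units.

```python
def therapeutic_index(ld50, ed50):
    """Return the therapeutic index, the ratio LD50 / ED50.

    Both doses must be positive and expressed in the same units
    (for example, mg per kg body weight).
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical values for two candidate compounds (mg/kg).
compound_a = therapeutic_index(ld50=500.0, ed50=5.0)
compound_b = therapeutic_index(ld50=200.0, ed50=20.0)

# A higher index means a wider margin between effective and lethal doses.
preferred = "A" if compound_a > compound_b else "B"
print(compound_a, compound_b, preferred)
```

In this illustration compound A, with the higher index, would be the preferred candidate under the criterion stated in the text.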

Dosage intervals can also be determined using the MEC value. Compounds should be 
administered using a regimen which maintains plasma levels above the MEC for 10-90% of the 
time, preferably between 30-90% and most preferably between 50-90%. In cases of local 
administration or selective uptake, the effective local concentration of the drug may not be 
related to plasma concentration. 
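
The time-above-MEC criterion can be checked against measured plasma levels. The sketch below assumes plasma concentrations sampled at equal time intervals over one dosing period, so that the fraction of samples above the MEC approximates the fraction of time above it. The function names and the sample values are hypothetical.

```python
def fraction_above_mec(concentrations, mec):
    """Fraction of equally spaced plasma samples that exceed the MEC."""
    if not concentrations:
        raise ValueError("at least one sample is required")
    above = sum(1 for c in concentrations if c > mec)
    return above / len(concentrations)


def regimen_band(fraction):
    """Classify a fraction against the ranges given in the text."""
    if 0.5 <= fraction <= 0.9:
        return "most preferred (50-90%)"
    if 0.3 <= fraction <= 0.9:
        return "preferred (30-90%)"
    if 0.1 <= fraction <= 0.9:
        return "within 10-90%"
    return "outside 10-90%"


# Hypothetical plasma levels (same units as the MEC) over one dosing interval.
samples = [0.5, 2.0, 4.0, 3.0, 2.5, 1.5, 1.2, 0.9, 0.6, 0.4]
fraction = fraction_above_mec(samples, mec=1.0)
print(fraction, regimen_band(fraction))
```

Unevenly spaced samples would need to be weighted by the length of each interval, which this sketch does not attempt.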

An exemplary dosage regimen for polypeptides or other compositions of the invention 
will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight daily, with the preferred 
dose being about 0.1 µg/kg to 25 mg/kg of patient body weight daily, varying in adults and 
children. Dosing may be once daily, or equivalent doses may be delivered at longer or shorter 
intervals. 
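
To show how a per-kilogram range translates into an absolute daily amount, the following minimal sketch scales the exemplary range of about 0.01 µg/kg to 100 mg/kg by body weight. The constant and function names are hypothetical, the 70 kg body weight is an arbitrary illustration, and the result is arithmetic only, not a dosing recommendation.

```python
UG_PER_MG = 1000.0

# Exemplary daily range from the text, expressed in micrograms per kg.
EXEMPLARY_RANGE_UG_PER_KG = (0.01, 100.0 * UG_PER_MG)


def daily_dose_range_ug(body_weight_kg, range_ug_per_kg=EXEMPLARY_RANGE_UG_PER_KG):
    """Return (low, high) total daily dose in micrograms for a body weight."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = range_ug_per_kg
    return low * body_weight_kg, high * body_weight_kg


low_ug, high_ug = daily_dose_range_ug(70.0)
print(f"{low_ug:.2f} ug to {high_ug / UG_PER_MG:.0f} mg per day")
```

A narrower range can be passed as `range_ug_per_kg` in the same units.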

The amount of composition administered will, of course, be dependent on the subject 
being treated, on the subject's age and weight, the severity of the affliction, the manner of 
administration and the judgment of the prescribing physician. 

4.12.4 PACKAGING 

The compositions may, if desired, be presented in a pack or dispenser device which may 
contain one or more unit dosage forms containing the active ingredient. The pack may, for 
example, comprise metal or plastic foil, such as a blister pack. The pack or dispenser device may 
be accompanied by instructions for administration. Compositions comprising a compound of the 
invention formulated in a compatible pharmaceutical carrier may also be prepared, placed in an 
appropriate container, and labeled for treatment of an indicated condition. 

4.13 ANTIBODIES 

Also included in the invention are antibodies to proteins, or fragments of proteins of the 
invention. The term "antibody" as used herein refers to immunoglobulin molecules and 
immunologically active portions of immunoglobulin (Ig) molecules, i.e., molecules that contain 
an antigen binding site that specifically binds (immunoreacts with) an antigen. Such antibodies 
include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, Fab, Fab' and F(ab')2 
fragments, and an Fab expression library. In general, an antibody molecule obtained from 
humans relates to any of the classes IgG, IgM, IgA, IgE and IgD, which differ from one another 
by the nature of the heavy chain present in the molecule. Certain classes have subclasses as well, 
such as IgG1, IgG2, and others. Furthermore, in humans, the light chain may be a kappa chain or 
a lambda chain. Reference herein to antibodies includes a reference to all such classes, 
subclasses and types of human antibody species. 

An isolated related protein of the invention may be intended to serve as an antigen, or a 
portion or fragment thereof, and additionally can be used as an immunogen to generate 
antibodies that immunospecifically bind the antigen, using standard techniques for polyclonal 
and monoclonal antibody preparation. The full-length protein can be used or, alternatively, the 
invention provides antigenic peptide fragments of the antigen for use as immunogens. An 
antigenic peptide fragment comprises at least 6 amino acid residues of the amino acid sequence 
of the full length protein (for example the amino acid sequence shown in SEQ ID NO: 1351), 
and encompasses an epitope thereof such that an antibody raised against the peptide forms a 
specific immune complex with the full length protein or with any fragment that contains the 
epitope. Preferably, the antigenic peptide comprises at least 10 amino acid residues, or at least 
15 amino acid residues, or at least 20 amino acid residues, or at least 30 amino acid residues. 
Preferred epitopes encompassed by the antigenic peptide are regions of the protein that are 
located on its surface; commonly these are hydrophilic regions. 

In certain embodiments of the invention, at least one epitope encompassed by the
antigenic peptide is a region of the related protein that is located on the surface of the protein,
e.g., a hydrophilic region. A hydrophobicity analysis of the human related protein sequence will
indicate which regions of a related protein are particularly hydrophilic and, therefore, are likely
to encode surface residues useful for targeting antibody production. As a means for targeting
antibody production, hydropathy plots showing regions of hydrophilicity and hydrophobicity
may be generated by any method well known in the art, including, for example, the
Kyte-Doolittle or the Hopp-Woods methods, either with or without Fourier transformation. See,
e.g., Hopp and Woods, 1981, Proc. Nat. Acad. Sci. USA 78: 3824-3828; Kyte and Doolittle,
1982, J. Mol. Biol. 157: 105-142, each of which is incorporated herein by reference in its
entirety. Antibodies that are specific for one or more domains within an antigenic protein, or
derivatives, fragments, analogs or homologs thereof, are also provided herein.
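
By way of non-limiting illustration, a hydropathy profile of the kind referred to above may be
sketched as follows. The sketch uses the published Kyte-Doolittle residue scale with a simple
sliding-window mean and no Fourier transformation. The function names, the window length of 7,
the cutoff of -1.5 and the toy sequence are assumptions chosen for illustration only and are not
taken from the Sequence Listing.

```python
# Kyte-Doolittle hydropathy values (Kyte and Doolittle, 1982).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Return the mean hydropathy of every full window along the sequence."""
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[res] for res in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def hydrophilic_windows(sequence, window=7, cutoff=-1.5):
    """List (start, peptide, score) for windows whose mean score is below cutoff.

    The cutoff is an illustrative assumption, not a value from the disclosure.
    """
    profile = hydropathy_profile(sequence, window)
    return [
        (i, sequence[i:i + window], round(score, 2))
        for i, score in enumerate(profile)
        if score < cutoff
    ]


if __name__ == "__main__":
    # Hypothetical toy sequence, not a sequence of the invention.
    toy = "MLLAVLIVAGKDEEKRRDSNQAVLLIVA"
    for start, peptide, score in hydrophilic_windows(toy):
        print(start, peptide, score)
```

Windows with strongly negative mean scores are candidate hydrophilic, surface-exposed regions;
whether a given window is in fact a useful epitope would still need to be determined
experimentally.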

A protein of the invention, or a derivative, fragment, analog, homolog or ortholog
thereof, may be utilized as an immunogen in the generation of antibodies that
immunospecifically bind these protein components.

Various procedures known within the art may be used for the production of polyclonal or
monoclonal antibodies directed against a protein of the invention, or against derivatives,
fragments, analogs, homologs or orthologs thereof (see, for example, Antibodies: A Laboratory
Manual, Harlow E, and Lane D, 1988, Cold Spring Harbor Laboratory Press, Cold Spring
Harbor, NY, incorporated herein by reference). Some of these antibodies are discussed below.

4.13.1 Polyclonal Antibodies

For the production of polyclonal antibodies, various suitable host animals (e.g., rabbit,
goat, mouse or other mammal) may be immunized by one or more injections with the native
protein, a synthetic variant thereof, or a derivative of the foregoing. An appropriate
immunogenic preparation can contain, for example, the naturally occurring immunogenic
protein, a chemically synthesized polypeptide representing the immunogenic protein, or a
recombinantly expressed immunogenic protein. Furthermore, the protein may be conjugated to
a second protein known to be immunogenic in the mammal being immunized. Examples of such
immunogenic proteins include but are not limited to keyhole limpet hemocyanin, serum albumin,
bovine thyroglobulin, and soybean trypsin inhibitor. The preparation can further include an
adjuvant. Various adjuvants used to increase the immunological response include, but are not
limited to, Freund's (complete and incomplete), mineral gels (e.g., aluminum hydroxide), surface
active substances (e.g., lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions,
dinitrophenol, etc.), adjuvants usable in humans such as Bacille Calmette-Guerin and
Corynebacterium parvum, or similar immunostimulatory agents. Additional examples of
adjuvants which can be employed include MPL-TDM adjuvant (monophosphoryl Lipid A,
synthetic trehalose dicorynomycolate).

The polyclonal antibody molecules directed against the immunogenic protein can be
isolated from the mammal (e.g., from the blood) and further purified by well known techniques,
such as affinity chromatography using protein A or protein G, which provide primarily the IgG
fraction of immune serum. Subsequently, or alternatively, the specific antigen which is the
target of the immunoglobulin sought, or an epitope thereof, may be immobilized on a column to
purify the immune specific antibody by immunoaffinity chromatography. Purification of
immunoglobulins is discussed, for example, by D. Wilkinson (The Scientist, published by The
Scientist, Inc., Philadelphia PA, Vol. 14, No. 8 (April 17, 2000), pp. 25-28).

4.13.2 Monoclonal Antibodies

The term "monoclonal antibody" (MAb) or "monoclonal antibody composition", as used
herein, refers to a population of antibody molecules that contain only one molecular species of
antibody molecule consisting of a unique light chain gene product and a unique heavy chain
gene product. In particular, the complementarity determining regions (CDRs) of the monoclonal
antibody are identical in all the molecules of the population. MAbs thus contain an antigen
binding site capable of immunoreacting with a particular epitope of the antigen characterized by
a unique binding affinity for it.

Monoclonal antibodies can be prepared using hybridoma methods, such as those
described by Kohler and Milstein, Nature, 256:495 (1975). In a hybridoma method, a mouse,
hamster, or other appropriate host animal, is typically immunized with an immunizing agent to
elicit lymphocytes that produce or are capable of producing antibodies that will specifically bind
to the immunizing agent. Alternatively, the lymphocytes can be immunized in vitro.

The immunizing agent will typically include the protein antigen, a fragment thereof or a
fusion protein thereof. Generally, either peripheral blood lymphocytes are used if cells of human
origin are desired, or spleen cells or lymph node cells are used if non-human mammalian sources
are desired. The lymphocytes are then fused with an immortalized cell line using a suitable
fusing agent, such as polyethylene glycol, to form a hybridoma cell (Goding, Monoclonal
Antibodies: Principles and Practice, Academic Press, (1986) pp. 59-103). Immortalized cell
lines are usually transformed mammalian cells, particularly myeloma cells of rodent, bovine and
human origin. Usually, rat or mouse myeloma cell lines are employed. The hybridoma cells can
be cultured in a suitable culture medium that preferably contains one or more substances that
inhibit the growth or survival of the unfused, immortalized cells. For example, if the parental
cells lack the enzyme hypoxanthine guanine phosphoribosyl transferase (HGPRT or HPRT), the
culture medium for the hybridomas typically will include hypoxanthine, aminopterin, and
thymidine ("HAT medium"), which substances prevent the growth of HGPRT-deficient cells.

Preferred immortalized cell lines are those that fuse efficiently, support stable high level
expression of antibody by the selected antibody-producing cells, and are sensitive to a medium
such as HAT medium. More preferred immortalized cell lines are murine myeloma lines, which
can be obtained, for instance, from the Salk Institute Cell Distribution Center, San Diego,
California and the American Type Culture Collection, Manassas, Virginia. Human myeloma and
mouse-human heteromyeloma cell lines also have been described for the production of human
monoclonal antibodies (Kozbor, J. Immunol., 133:3001 (1984); Brodeur et al., Monoclonal
Antibody Production Techniques and Applications, Marcel Dekker, Inc., New York, (1987) pp.
51-63).

The culture medium in which the hybridoma cells are cultured can then be assayed for
the presence of monoclonal antibodies directed against the antigen. Preferably, the binding
specificity of monoclonal antibodies produced by the hybridoma cells is determined by
immunoprecipitation or by an in vitro binding assay, such as radioimmunoassay (RIA) or
enzyme-linked immunosorbent assay (ELISA). Such techniques and assays are known in the
art. The binding affinity of the monoclonal antibody can, for example, be determined by the
Scatchard analysis of Munson and Pollard, Anal. Biochem., 107:220 (1980). Preferably,
antibodies having a high degree of specificity and a high binding affinity for the target antigen
are isolated.
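
By way of non-limiting illustration, the arithmetic behind a Scatchard analysis may be sketched
as follows. For ideal single-site binding, a plot of bound/free against bound is a straight line
with slope -1/Kd and x-intercept Bmax. The sketch is a plain unweighted least-squares fit, which
is an assumption made for brevity and is not asserted to be the procedure of the cited reference.
The function names and the simulated data are likewise hypothetical.

```python
def scatchard_fit(bound, free):
    """Fit bound/free against bound by unweighted least squares.

    Returns (kd, bmax), using slope = -1/Kd and x-intercept = Bmax.
    """
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two paired measurements")
    ratio = [b / f for b, f in zip(bound, free)]
    n = len(bound)
    mean_x = sum(bound) / n
    mean_y = sum(ratio) / n
    sxx = sum((x - mean_x) ** 2 for x in bound)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(bound, ratio))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope
    bmax = -intercept / slope
    return kd, bmax


def simulate_binding(free, kd, bmax):
    """Ideal single-site binding isotherm: B = Bmax * F / (Kd + F)."""
    return [bmax * f / (kd + f) for f in free]


if __name__ == "__main__":
    # Simulated, noise-free data in arbitrary concentration units.
    free = [0.5, 1.0, 2.0, 4.0, 8.0]
    bound = simulate_binding(free, kd=2.0, bmax=10.0)
    print(scatchard_fit(bound, free))
```

A smaller fitted Kd corresponds to a higher binding affinity. With real, noisy measurements a
weighted or nonlinear fit is generally preferable to the linear transformation shown here.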

After the desired hybridoma cells are identified, the clones can be subcloned by limiting
dilution procedures and grown by standard methods. Suitable culture media for this purpose
include, for example, Dulbecco's Modified Eagle's Medium and RPMI-1640 medium.
Alternatively, the hybridoma cells can be grown in vivo as ascites in a mammal.

The monoclonal antibodies secreted by the subclones can be isolated or purified from the
culture medium or ascites fluid by conventional immunoglobulin purification procedures such
as, for example, protein A-Sepharose, hydroxylapatite chromatography, gel electrophoresis,
dialysis, or affinity chromatography.

The monoclonal antibodies can also be made by recombinant DNA methods, such as
those described in U.S. Patent No. 4,816,567. DNA encoding the monoclonal antibodies of the
invention can be readily isolated and sequenced using conventional procedures (e.g., by using
oligonucleotide probes that are capable of binding specifically to genes encoding the heavy and
light chains of murine antibodies). The hybridoma cells of the invention serve as a preferred
source of such DNA. Once isolated, the DNA can be placed into expression vectors, which are
then transfected into host cells such as simian COS cells, Chinese hamster ovary (CHO) cells, or
myeloma cells that do not otherwise produce immunoglobulin protein, to obtain the synthesis of
monoclonal antibodies in the recombinant host cells. The DNA also can be modified, for
example, by substituting the coding sequence for human heavy and light chain constant domains
in place of the homologous murine sequences (U.S. Patent No. 4,816,567; Morrison, Nature 368,
812-13 (1994)) or by covalently joining to the immunoglobulin coding sequence all or part of the
coding sequence for a non-immunoglobulin polypeptide. Such a non-immunoglobulin
polypeptide can be substituted for the constant domains of an antibody of the invention, or can
be substituted for the variable domains of one antigen-combining site of an antibody of the
invention to create a chimeric bivalent antibody.

4.13.2 Humanized Antibodies

The antibodies directed against the protein antigens of the invention can further comprise
humanized antibodies or human antibodies. These antibodies are suitable for administration to
humans without engendering an immune response by the human against the administered
immunoglobulin. Humanized forms of antibodies are chimeric immunoglobulins,
immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab', F(ab')2 or other
antigen-binding subsequences of antibodies) that are principally comprised of the sequence of a
human immunoglobulin, and contain minimal sequence derived from a non-human immunoglobulin.
Humanization can be performed following the method of Winter and co-workers (Jones et al.,
Nature, 321:522-525 (1986); Riechmann et al., Nature, 332:323-327 (1988); Verhoeyen et al.,
Science, 239:1534-1536 (1988)), by substituting rodent CDRs or CDR sequences for the
corresponding sequences of a human antibody. (See also U.S. Patent No. 5,225,539.) In some
instances, Fv framework residues of the human immunoglobulin are replaced by corresponding
non-human residues. Humanized antibodies can also comprise residues which are found neither
in the recipient antibody nor in the imported CDR or framework sequences. In general, the
humanized antibody will comprise substantially all of at least one, and typically two, variable
domains, in which all or substantially all of the CDR regions correspond to those of a non-human
immunoglobulin and all or substantially all of the framework regions are those of a human
immunoglobulin consensus sequence. The humanized antibody optimally also will comprise at
least a portion of an immunoglobulin constant region (Fc), typically that of a human
immunoglobulin (Jones et al., 1986; Riechmann et al., 1988; and Presta, Curr. Op. Struct. Biol.,
2:593-596 (1992)).

4.13.3 Human Antibodies

Fully human antibodies relate to antibody molecules in which essentially the entire
sequences of both the light chain and the heavy chain, including the CDRs, arise from human
genes. Such antibodies are termed "human antibodies", or "fully human antibodies" herein.
Human monoclonal antibodies can be prepared by the trioma technique; the human B-cell
hybridoma technique (see Kozbor, et al., 1983 Immunol Today 4: 72) and the EBV hybridoma
technique to produce human monoclonal antibodies (see Cole, et al., 1985 In: Monoclonal
Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). Human monoclonal
antibodies may be utilized in the practice of the present invention and may be produced by using
human hybridomas (see Cote, et al., 1983. Proc Natl Acad Sci USA 80: 2026-2030) or by
transforming human B-cells with Epstein Barr Virus in vitro (see Cole, et al., 1985 In:
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96).

In addition, human antibodies can also be produced using additional techniques,
including phage display libraries (Hoogenboom and Winter, J. Mol. Biol., 227:381 (1991);
Marks et al., J. Mol. Biol., 222:581 (1991)). Similarly, human antibodies can be made by
introducing human immunoglobulin loci into transgenic animals, e.g., mice in which the
endogenous immunoglobulin genes have been partially or completely inactivated. Upon
challenge, human antibody production is observed, which closely resembles that seen in humans
in all respects, including gene rearrangement, assembly, and antibody repertoire. This approach
is described, for example, in U.S. Patent Nos. 5,545,807; 5,545,806; 5,569,825; 5,625,126;
5,633,425; 5,661,016, and in Marks et al. (Bio/Technology 10, 779-783 (1992)); Lonberg et al.
(Nature 368, 856-859 (1994)); Morrison (Nature 368, 812-13 (1994)); Fishwild et al. (Nature
Biotechnology 14, 845-51 (1996)); Neuberger (Nature Biotechnology 14, 826 (1996)); and
Lonberg and Huszar (Intern. Rev. Immunol. 13, 65-93 (1995)).

Human antibodies may additionally be produced using transgenic nonhuman animals
which are modified so as to produce fully human antibodies rather than the animal's endogenous
antibodies in response to challenge by an antigen. (See PCT publication WO94/02602). The
endogenous genes encoding the heavy and light immunoglobulin chains in the nonhuman host
have been incapacitated, and active loci encoding human heavy and light chain immunoglobulins
are inserted into the host's genome. The human genes are incorporated, for example, using yeast
artificial chromosomes containing the requisite human DNA segments. An animal which
provides all the desired modifications is then obtained as progeny by crossbreeding intermediate
transgenic animals containing fewer than the full complement of the modifications. The
preferred embodiment of such a nonhuman animal is a mouse, and is termed the Xenomouse™
as disclosed in PCT publications WO 96/33735 and WO 96/34096. This animal produces B
cells which secrete fully human immunoglobulins. The antibodies can be obtained directly from
the animal after immunization with an immunogen of interest, as, for example, a preparation of a
polyclonal antibody, or alternatively from immortalized B cells derived from the animal, such as
hybridomas producing monoclonal antibodies. Additionally, the genes encoding the
immunoglobulins with human variable regions can be recovered and expressed to obtain the 
antibodies directly, or can be further modified to obtain analogs of antibodies such as, for 
example, single chain Fv molecules. 

An example of a method of producing a nonhuman host, exemplified as a mouse, lacking
expression of an endogenous immunoglobulin heavy chain is disclosed in U.S. Patent No.
5,939,598. It can be obtained by a method including deleting the J segment genes from at least
one endogenous heavy chain locus in an embryonic stem cell to prevent rearrangement of the
locus and to prevent formation of a transcript of a rearranged immunoglobulin heavy chain locus,
the deletion being effected by a targeting vector containing a gene encoding a selectable marker;
and producing from the embryonic stem cell a transgenic mouse whose somatic and germ cells
contain the gene encoding the selectable marker.

A method for producing an antibody of interest, such as a human antibody, is disclosed in
U.S. Patent No. 5,916,771. It includes introducing an expression vector that contains a
nucleotide sequence encoding a heavy chain into one mammalian host cell in culture, introducing
an expression vector containing a nucleotide sequence encoding a light chain into another
mammalian host cell, and fusing the two cells to form a hybrid cell. The hybrid cell expresses an
antibody containing the heavy chain and the light chain.

In a further improvement on this procedure, a method for identifying a clinically relevant
epitope on an immunogen, and a correlative method for selecting an antibody that binds
immunospecifically to the relevant epitope with high affinity, are disclosed in PCT publication
WO 99/53049.

4.13.4 Fab Fragments and Single Chain Antibodies

According to the invention, techniques can be adapted for the production of single-chain
antibodies specific to an antigenic protein of the invention (see e.g., U.S. Patent No. 4,946,778).
In addition, methods can be adapted for the construction of Fab expression libraries (see e.g.,
Huse, et al., 1989 Science 246: 1275-1281) to allow rapid and effective identification of
monoclonal Fab fragments with the desired specificity for a protein or derivatives, fragments,
analogs or homologs thereof. Antibody fragments that contain the idiotypes to a protein antigen
may be produced by techniques known in the art including, but not limited to: (i) an F(ab')2
fragment produced by pepsin digestion of an antibody molecule; (ii) an Fab fragment generated
by reducing the disulfide bridges of an F(ab')2 fragment; (iii) an Fab fragment generated by the
treatment of the antibody molecule with papain and a reducing agent; and (iv) Fv fragments.

4.13.5 Bispecific Antibodies
Bispecific antibodies are monoclonal, preferably human or humanized, antibodies that
have binding specificities for at least two different antigens. In the present case, one of the
binding specificities is for an antigenic protein of the invention. The second binding target is any
other antigen, and advantageously is a cell-surface protein or receptor or receptor subunit.

Methods for making bispecific antibodies are known in the art. Traditionally, the
recombinant production of bispecific antibodies is based on the co-expression of two
immunoglobulin heavy-chain/light-chain pairs, where the two heavy chains have different
specificities (Milstein and Cuello, Nature, 305:537-539 (1983)). Because of the random
assortment of immunoglobulin heavy and light chains, these hybridomas (quadromas) produce a
potential mixture of ten different antibody molecules, of which only one has the correct
bispecific structure. The purification of the correct molecule is usually accomplished by affinity
chromatography steps. Similar procedures are disclosed in WO 93/08829, published 13 May
1993, and in Traunecker et al., 1991 EMBO J., 10:3655-3659.
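
By way of non-limiting illustration, the figure of ten possible molecules can be reproduced by a
simple enumeration. The sketch assumes that each antibody molecule consists of two heavy-light
arms, that either heavy chain may pair with either light chain, and that the two arms of a
molecule are interchangeable. The chain labels H1, H2, L1 and L2 and the function name are
hypothetical.

```python
from itertools import combinations_with_replacement, product


def quadroma_species(pairs=(("H1", "L1"), ("H2", "L2"))):
    """Enumerate the antibody molecules a quadroma could assemble.

    `pairs` lists the two cognate heavy/light pairs. Returns a tuple of
    (all distinct molecules, correctly paired bispecific molecules).
    """
    heavy = [h for h, _ in pairs]
    light = [l for _, l in pairs]
    # Four possible arms: each heavy chain with each light chain.
    arms = [(h, l) for h, l in product(heavy, light)]
    # A molecule is an unordered pair of arms, repeats allowed.
    species = list(combinations_with_replacement(arms, 2))
    wanted = tuple(tuple(p) for p in pairs)
    bispecific = [m for m in species if m == wanted]
    return species, bispecific


if __name__ == "__main__":
    species, bispecific = quadroma_species()
    print(len(species), "possible molecules")
    print(len(bispecific), "with the correct bispecific structure")
```

Four possible arms taken two at a time with repetition give ten distinct molecules, only one of
which carries both cognate heavy-light pairs, which is why a purification step is needed.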

Antibody variable domains with the desired binding specificities (antibody-antigen
combining sites) can be fused to immunoglobulin constant domain sequences. The fusion
preferably is with an immunoglobulin heavy-chain constant domain, comprising at least part of
the hinge, CH2, and CH3 regions. It is preferred to have the first heavy-chain constant region
(CH1) containing the site necessary for light-chain binding present in at least one of the fusions.
DNAs encoding the immunoglobulin heavy-chain fusions and, if desired, the immunoglobulin
light chain, are inserted into separate expression vectors, and are co-transfected into a suitable
host organism. For further details of generating bispecific antibodies see, for example, Suresh et
al., Methods in Enzymology, 121:210 (1986).

According to another approach described in WO 96/27011, the interface between a pair
of antibody molecules can be engineered to maximize the percentage of heterodimers which are
recovered from recombinant cell culture. The preferred interface comprises at least a part of the
CH3 region of an antibody constant domain. In this method, one or more small amino acid side
chains from the interface of the first antibody molecule are replaced with larger side chains (e.g.
tyrosine or tryptophan). Compensatory "cavities" of identical or similar size to the large side
chain(s) are created on the interface of the second antibody molecule by replacing large amino
acid side chains with smaller ones (e.g. alanine or threonine). This provides a mechanism for
increasing the yield of the heterodimer over other unwanted end-products such as homodimers.

Bispecific antibodies can be prepared as full length antibodies or antibody fragments (e.g.
F(ab')2 bispecific antibodies). Techniques for generating bispecific antibodies from antibody
fragments have been described in the literature. For example, bispecific antibodies can be
prepared using chemical linkage. Brennan et al., Science 229:81 (1985) describe a procedure
wherein intact antibodies are proteolytically cleaved to generate F(ab')2 fragments. These
fragments are reduced in the presence of the dithiol complexing agent sodium arsenite to
stabilize vicinal dithiols and prevent intermolecular disulfide formation. The Fab' fragments
generated are then converted to thionitrobenzoate (TNB) derivatives. One of the Fab'-TNB
derivatives is then reconverted to the Fab'-thiol by reduction with mercaptoethylamine and is
mixed with an equimolar amount of the other Fab'-TNB derivative to form the bispecific
antibody. The bispecific antibodies produced can be used as agents for the selective
immobilization of enzymes.

Additionally, Fab' fragments can be directly recovered from E. coli and chemically
coupled to form bispecific antibodies. Shalaby et al., J. Exp. Med. 175:217-225 (1992) describe
the production of a fully humanized bispecific antibody F(ab')2 molecule. Each Fab' fragment
was separately secreted from E. coli and subjected to directed chemical coupling in vitro to form
the bispecific antibody. The bispecific antibody thus formed was able to bind to cells
overexpressing the ErbB2 receptor and normal human T cells, as well as trigger the lytic activity
of human cytotoxic lymphocytes against human breast tumor targets.

Various techniques for making and isolating bispecific antibody fragments directly from
recombinant cell culture have also been described. For example, bispecific antibodies have been
produced using leucine zippers. Kostelny et al., J. Immunol. 148(5):1547-1553 (1992). The
leucine zipper peptides from the Fos and Jun proteins were linked to the Fab' portions of two
different antibodies by gene fusion. The antibody homodimers were reduced at the hinge region
to form monomers and then re-oxidized to form the antibody heterodimers. This method can
also be utilized for the production of antibody homodimers. The "diabody" technology
described by Hollinger et al., Proc. Natl. Acad. Sci. USA 90:6444-6448 (1993) has provided an
alternative mechanism for making bispecific antibody fragments. The fragments comprise a
heavy-chain variable domain (VH) connected to a light-chain variable domain (VL) by a linker
which is too short to allow pairing between the two domains on the same chain. Accordingly,
the VH and VL domains of one fragment are forced to pair with the complementary VL and VH
domains of another fragment, thereby forming two antigen-binding sites. Another strategy for
making bispecific antibody fragments by the use of single-chain Fv (sFv) dimers has also been
reported. See Gruber et al., J. Immunol. 152:5368 (1994).

Antibodies with more than two valencies are contemplated. For example, trispecific
antibodies can be prepared. Tutt et al., J. Immunol. 147:60 (1991).

Exemplary bispecific antibodies can bind to two different epitopes, at least one of which
originates in the protein antigen of the invention. Alternatively, an anti-antigenic arm of an
immunoglobulin molecule can be combined with an arm which binds to a triggering molecule on
a leukocyte such as a T-cell receptor molecule (e.g. CD2, CD3, CD28, or B7), or Fc receptors for
IgG (FcγR), such as FcγRI (CD64), FcγRII (CD32) and FcγRIII (CD16), so as to focus
cellular defense mechanisms to the cell expressing the particular antigen. Bispecific antibodies
can also be used to direct cytotoxic agents to cells which express a particular antigen. These
antibodies possess an antigen-binding arm and an arm which binds a cytotoxic agent or a
radionuclide chelator, such as EOTUBE, DPTA, DOTA, or TETA. Another bispecific antibody
of interest binds the protein antigen described herein and further binds tissue factor (TF).

4.13.6 Heteroconjugate Antibodies

Heteroconjugate antibodies are also within the scope of the present invention.
Heteroconjugate antibodies are composed of two covalently joined antibodies. Such antibodies
have, for example, been proposed to target immune system cells to unwanted cells (U.S. Patent
No. 4,676,980), and for treatment of HIV infection (WO 91/00360; WO 92/200373; EP 03089).
It is contemplated that the antibodies can be prepared in vitro using known methods in synthetic
protein chemistry, including those involving crosslinking agents. For example, immunotoxins
can be constructed using a disulfide exchange reaction or by forming a thioether bond.
Examples of suitable reagents for this purpose include iminothiolate and
methyl-4-mercaptobutyrimidate and those disclosed, for example, in U.S. Patent No. 4,676,980.

4.13.7 Effector Function Engineering

It can be desirable to modify the antibody of the invention with respect to effector function, so as
to enhance, e.g., the effectiveness of the antibody in treating cancer. For example, cysteine
residue(s) can be introduced into the Fc region, thereby allowing interchain disulfide bond
formation in this region. The homodimeric antibody thus generated can have improved
internalization capability and/or increased complement-mediated cell killing and
antibody-dependent cellular cytotoxicity (ADCC). See Caron et al., J. Exp. Med., 176:
1191-1195 (1992) and Shopes, J. Immunol., 148: 2918-2922 (1992). Homodimeric antibodies
with enhanced anti-tumor activity can also be prepared using heterobifunctional cross-linkers as
described in Wolff et al., Cancer Research, 53: 2560-2565 (1993). Alternatively, an antibody can
be engineered that has dual Fc regions and can thereby have enhanced complement lysis and
ADCC capabilities. See Stevenson et al., Anti-Cancer Drug Design, 3: 219-230 (1989).

4.13.8 Immunoconjugates

The invention also pertains to immunoconjugates comprising an antibody conjugated to a
cytotoxic agent such as a chemotherapeutic agent, toxin (e.g., an enzymatically active toxin of
bacterial, fungal, plant, or animal origin, or fragments thereof), or a radioactive isotope (i.e., a
radioconjugate).

Chemotherapeutic agents useful in the generation of such immunoconjugates have been
described above. Enzymatically active toxins and fragments thereof that can be used include
diphtheria A chain, nonbinding active fragments of diphtheria toxin, exotoxin A chain (from
Pseudomonas aeruginosa), ricin A chain, abrin A chain, modeccin A chain, alpha-sarcin,
Aleurites fordii proteins, dianthin proteins, Phytolacca americana proteins (PAPI, PAPII, and
PAP-S), Momordica charantia inhibitor, curcin, crotin, Saponaria officinalis inhibitor, gelonin,
mitogellin, restrictocin, phenomycin, enomycin, and the tricothecenes. A variety of
radionuclides are available for the production of radioconjugated antibodies. Examples include
212Bi, 131I, 131In, 90Y, and 186Re.

Conjugates of the antibody and cytotoxic agent are made using a variety of bifunctional
protein-coupling agents such as N-succinimidyl-3-(2-pyridyldithiol) propionate (SPDP),
iminothiolane (IT), bifunctional derivatives of imidoesters (such as dimethyl adipimidate HCl),
active esters (such as disuccinimidyl suberate), aldehydes (such as glutaraldehyde), bis-azido
compounds (such as bis (p-azidobenzoyl) hexanediamine), bis-diazonium derivatives (such as
bis-(p-diazoniumbenzoyl)-ethylenediamine), diisocyanates (such as toluene 2,6-diisocyanate),
and bis-active fluorine compounds (such as 1,5-difluoro-2,4-dinitrobenzene). For example, a
ricin immunotoxin can be prepared as described in Vitetta et al., Science, 238: 1098 (1987).
Carbon-14-labeled 1-isothiocyanatobenzyl-3-methyldiethylene triaminepentaacetic acid
(MX-DTPA) is an exemplary chelating agent for conjugation of radionuclide to the antibody. See
WO94/11026.

In another embodiment, the antibody can be conjugated to a "receptor" (such as
streptavidin) for utilization in tumor pretargeting wherein the antibody-receptor conjugate is
administered to the patient, followed by removal of unbound conjugate from the circulation
using a clearing agent and then administration of a "ligand" (e.g., avidin) that is in turn
conjugated to a cytotoxic agent.

4.14 COMPUTER READABLE SEQUENCES 

In one application of this embodiment, a nucleotide sequence of the present invention can
be recorded on computer readable media. As used herein, "computer readable media" refers to
any medium which can be read and accessed directly by a computer. Such media include, but
are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and
magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM
and ROM; and hybrids of these categories such as magnetic/optical storage media. A skilled
artisan can readily appreciate how any of the presently known computer readable media can
be used to create a manufacture comprising computer readable medium having recorded thereon
a nucleotide sequence of the present invention. As used herein, "recorded" refers to a process for
storing information on computer readable medium. A skilled artisan can readily adopt any of the
presently known methods for recording information on computer readable medium to generate
manufactures comprising the nucleotide sequence information of the present invention.

A variety of data storage structures are available to a skilled artisan for creating a 
computer readable medium having recorded thereon a nucleotide sequence of the present 
invention. The choice of the data storage structure will generally be based on the means chosen
to access the stored information. In addition, a variety of data processor programs and formats
can be used to store the nucleotide sequence information of the present invention on computer 
readable medium. The sequence information can be represented in a word processing text file, 
formatted in commercially-available software such as WordPerfect and Microsoft Word, or 
represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase,
Oracle, or the like. A skilled artisan can readily adapt any number of data processor structuring
formats (e.g. text file or database) in order to obtain computer readable medium having recorded 
thereon the nucleotide sequence information of the present invention. 
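
By way of illustration only, the two storage options named above (a plain ASCII text file and a database application) can be sketched as follows. This is a minimal sketch under stated assumptions: FASTA is used as the ASCII layout and SQLite stands in for the database application, neither of which is specified in the text, and the function names, table name and the identifier "demo_1" are hypothetical.

```python
import sqlite3


def to_fasta(seq_id, sequence, width=60):
    """Format one sequence as a FASTA record (plain ASCII text)."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return ">" + seq_id + "\n" + "\n".join(lines) + "\n"


def store_sequences(conn, records):
    """Store (identifier, residues) pairs in a hypothetical 'sequences' table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sequences "
        "(seq_id TEXT PRIMARY KEY, residues TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO sequences VALUES (?, ?)", records)
    conn.commit()


def fetch_sequence(conn, seq_id):
    """Return the stored residues for seq_id, or None if it is absent."""
    row = conn.execute(
        "SELECT residues FROM sequences WHERE seq_id = ?", (seq_id,)
    ).fetchone()
    return row[0] if row else None
```

The choice between the two forms follows the point made above: a flat text record suits sequential reading, whereas a keyed table suits retrieval of a single sequence by identifier.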

By providing any of the nucleotide sequences SEQ ID NO: 1-1350 or a representative 
fragment thereof, or a nucleotide sequence at least 95% identical to any of the nucleotide
sequences of SEQ ID NO: 1-1350 in computer readable form, a skilled artisan can routinely
access the sequence information for a variety of purposes. Computer software is publicly 
available which allows a skilled artisan to access sequence information provided in a computer 
readable medium. The examples which follow demonstrate how software which implements the 
BLAST (Altschul et al., J. Mol. Biol. 215:403-410 (1990)) and BLAZE (Brutlag et al., Comp.
Chem. 17:203-207 (1993)) search algorithms on a Sybase system is used to identify open reading
frames (ORFs) within a nucleic acid sequence. Such ORFs may be protein encoding fragments 
and may be useful in producing commercially important proteins such as enzymes used in 
fermentation reactions and in the production of commercially useful metabolites. 
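
As a simplified illustration of what identifying an ORF within a nucleic acid sequence involves, the following sketch scans the three forward reading frames for a start codon followed by an in-frame stop codon. It is not an implementation of BLAST or BLAZE and does not use a Sybase system; it is a naive scan, and the function name, the minimum-length parameter and the restriction to the forward strand are assumptions made for brevity.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=2):
    """Return (start, end, frame) tuples for ATG-to-stop ORFs.

    Only the forward strand is scanned. 'end' is exclusive and includes the
    stop codon; min_codons counts the codons before the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # resume looking for the next start codon
    return orfs
```

A fuller treatment would also scan the reverse complement and would typically compare each candidate ORF against known protein sequences.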

As used herein, "a computer-based system" refers to the hardware means, software
means, and data storage means used to analyze the nucleotide sequence information of the
present invention. The minimum hardware means of the computer-based systems of the present
invention comprises a central processing unit (CPU), input means, output means, and data
storage means. A skilled artisan can readily appreciate that any one of the currently available
computer-based systems is suitable for use in the present invention. As stated above, the
computer-based systems of the present invention comprise a data storage means having stored
therein a nucleotide sequence of the present invention and the necessary hardware means and
software means for supporting and implementing a search means. As used herein, "data storage
means" refers to memory which can store nucleotide sequence information of the present
invention, or a memory access means which can access manufactures having recorded thereon
the nucleotide sequence information of the present invention.

As used herein, "search means" refers to one or more programs which are implemented 
on the computer-based system to compare a target sequence or target structural motif with the 
sequence information stored within the data storage means. Search means are used to identify 
fragments or regions of a known sequence which match a particular target sequence or target
motif. A variety of known algorithms are disclosed publicly, and a variety of commercially
available software for conducting search means is available and can be used in the computer-based
systems of the present invention. Examples of such software include, but are not limited to,
Smith-Waterman, MacPattern (EMBL), BLASTN and BLASTA (NCBI). A
skilled artisan can readily recognize that any one of the available algorithms or implementing
software packages for conducting homology searches can be adapted for use in the present
computer-based systems. As used herein, a "target sequence" can be any nucleic acid or amino
acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can
readily recognize that the longer a target sequence is, the less likely a target sequence will be
present as a random occurrence in the database. The most preferred sequence length of a target
sequence is from about 10 to 300 amino acids, more preferably from about 30 to 100 nucleotide
residues. However, it is well recognized that searches for commercially important fragments, 
such as sequence fragments involved in gene expression and protein processing, may be of 
shorter length. 
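
The relationship between target length and chance occurrence can be made concrete with a short calculation. The sketch below assumes, purely for illustration, that database residues are independent and equally frequent, which real sequence databases do not satisfy; the function name is hypothetical.

```python
def expected_chance_matches(target_len, database_len, alphabet_size=4):
    """Expected number of exact chance matches of a target in a database.

    Assumes independent, equally frequent residues: each of the
    (database_len - target_len + 1) start positions matches with
    probability alphabet_size ** -target_len.
    """
    positions = max(database_len - target_len + 1, 0)
    return positions * alphabet_size ** (-target_len)
```

Under these assumptions a six-nucleotide target is expected to occur by chance roughly once in every 4096 positions, whereas a 30-nucleotide target is very unlikely to occur by chance even in a database of a billion residues. Passing alphabet_size=20 gives the corresponding estimate for amino acid targets.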

As used herein, "a target structural motif," or "target motif," refers to any rationally 
selected sequence or combination of sequences in which the sequence(s) are chosen based on a
three-dimensional configuration which is formed upon the folding of the target motif. There are 
a variety of target motifs known in the art. Protein target motifs include, but are not limited to, 
enzyme active sites and signal sequences. Nucleic acid target motifs include, but are not limited 
to, promoter sequences, hairpin structures and inducible expression elements (protein binding 
sequences).
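
As one hedged example of searching for a nucleic acid target motif, the sketch below looks for candidate hairpin structures, taken here to be a stem segment followed, after a short loop, by its exact reverse complement. The stem and loop lengths, the requirement of perfect base pairing and the function names are illustrative assumptions rather than parameters taken from the text.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_hairpins(seq, stem=4, min_loop=3, max_loop=8):
    """Return (start, loop_length) for each perfect inverted repeat found.

    A hit means seq[start:start + stem] is followed, after loop_length
    unpaired bases, by its exact reverse complement.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * stem - min_loop + 1):
        target = reverse_complement(seq[i:i + stem])
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop  # start of the candidate right-hand stem
            if j + stem > len(seq):
                break
            if seq[j:j + stem] == target:
                hits.append((i, loop))
    return hits
```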

4.15 TRIPLE HELIX FORMATION 

In addition, the fragments of the present invention, as broadly described, can be used to 
control gene expression through triple helix formation or antisense DNA or RNA, both of which 
methods are based on the binding of a polynucleotide sequence to DNA or RNA.
Polynucleotides suitable for use in these methods are preferably 20 to 40 bases in length and are
designed to be complementary to a region of the gene involved in transcription (triple helix - see
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan
et al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem.
56:560 (1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press,
Boca Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA
transcription from DNA, while antisense RNA hybridization blocks translation of an mRNA
molecule into polypeptide. Both techniques have been demonstrated to be effective in model
systems. Information contained in the sequences of the present invention is necessary for the
design of an antisense or triple helix oligonucleotide.
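
The way sequence information feeds into antisense design can be sketched as follows: a window of 20 to 40 bases is chosen on the mRNA and the antisense oligonucleotide is its reverse complement. This is a minimal sketch; the function name and the example mRNA used to exercise it are hypothetical, and considerations such as target accessibility and specificity are deliberately left out.

```python
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_oligo(mrna, start, length=20):
    """Return the antisense RNA (5'->3') for mrna[start:start + length].

    The length is restricted to the 20 to 40 base range stated in the text.
    """
    if not 20 <= length <= 40:
        raise ValueError("length must be between 20 and 40 bases")
    target = mrna.upper()[start:start + length]
    if len(target) < length:
        raise ValueError("target window runs past the end of the mRNA")
    # Complement each base, then reverse so the result reads 5'->3'.
    return target.translate(RNA_COMPLEMENT)[::-1]
```

Reverse-complementing the returned oligonucleotide recovers the targeted mRNA window, which is a convenient check on the design.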

4.16 DIAGNOSTIC ASSAYS AND KITS 

The present invention further provides methods to identify the presence or expression of 
one of the ORFs of the present invention, or homolog thereof, in a test sample, using a nucleic
acid probe or antibodies of the present invention, optionally conjugated or otherwise associated
with a suitable label. 

In general, methods for detecting a polynucleotide of the invention can comprise 
contacting a sample with a compound that binds to and forms a complex with the polynucleotide 
for a period sufficient to form the complex, and detecting the complex, so that if a complex is
detected, a polynucleotide of the invention is detected in the sample. Such methods can also
comprise contacting a sample under stringent hybridization conditions with nucleic acid primers
that anneal to a polynucleotide of the invention under such conditions, and amplifying annealed 
polynucleotides, so that if a polynucleotide is amplified, a polynucleotide of the invention is 
detected in the sample. 

In general, methods for detecting a polypeptide of the invention can comprise contacting
a sample with a compound that binds to and forms a complex with the polypeptide for a period
sufficient to form the complex, and detecting the complex, so that if a complex is detected, a 
polypeptide of the invention is detected in the sample. 

In detail, such methods comprise incubating a test sample with one or more of the
antibodies or one or more of the nucleic acid probes of the present invention and assaying for
binding of the nucleic acid probes or antibodies to components within the test sample. 

Conditions for incubating a nucleic acid probe or antibody with a test sample vary. 
Incubation conditions depend on the format employed in the assay, the detection methods 
employed, and the type and nature of the nucleic acid probe or antibody used in the assay. One
skilled in the art will recognize that any one of the commonly available hybridization,
amplification or immunological assay formats can readily be adapted to employ the nucleic acid
probes or antibodies of the present invention. Examples of such assays can be found in Chard,
T., An Introduction to Radioimmunoassay and Related Techniques, Elsevier Science Publishers,
Amsterdam, The Netherlands (1986); Bullock, G.R. et al., Techniques in Immunocytochemistry,
Academic Press, Orlando, FL Vol. 1 (1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice
and Theory of Immunoassays: Laboratory Techniques in Biochemistry and Molecular Biology,
Elsevier Science Publishers, Amsterdam, The Netherlands (1985). The test samples of the
present invention include cells, protein or membrane extracts of cells, or biological fluids such as
sputum, blood, serum, plasma, or urine. The test sample used in the above-described method
will vary based on the assay format, nature of the detection method and the tissues, cells or
extracts used as the sample to be assayed. Methods for preparing protein extracts or membrane
extracts of cells are well known in the art and can readily be adapted in order to obtain a
sample which is compatible with the system utilized. 

In another embodiment of the present invention, kits are provided which contain the
necessary reagents to carry out the assays of the present invention. Specifically, the invention
provides a compartment kit to receive, in close confinement, one or more containers which
comprise: (a) a first container comprising one of the probes or antibodies of the present
invention; and (b) one or more other containers comprising one or more of the following: wash
reagents and reagents capable of detecting the presence of a bound probe or antibody.

In detail, a compartment kit includes any kit in which reagents are contained in separate
containers. Such containers include small glass containers, plastic containers or strips of plastic
or paper. Such containers allow one to efficiently transfer reagents from one compartment to
another compartment such that the samples and reagents are not cross-contaminated, and the
agents or solutions of each container can be added in a quantitative fashion from one
compartment to another. Such containers will include a container which will accept the test
sample, a container which contains the antibodies used in the assay, containers which contain
wash reagents (such as phosphate buffered saline, Tris-buffers, etc.), and containers which
contain the reagents used to detect the bound antibody or probe. Types of detection reagents
include labeled nucleic acid probes, labeled secondary antibodies, or in the alternative, if the
primary antibody is labeled, the enzymatic, or antibody binding reagents which are capable of
reacting with the labeled antibody. One skilled in the art will readily recognize that the disclosed
probes and antibodies of the present invention can be readily incorporated into one of the 
established kit formats which are well known in the art. 

4.17 MEDICAL IMAGING

The novel polypeptides and binding partners of the invention are useful in medical
imaging of sites expressing the molecules of the invention (e.g., where the polypeptide of the
invention is involved in the immune response, for imaging sites of inflammation or infection).
See, e.g., Kunkel et al., U.S. Pat. No. 5,413,778. Such methods involve chemical attachment of
a labeling or imaging agent, administration of the labeled polypeptide to a subject in a
pharmaceutically acceptable carrier, and imaging the labeled polypeptide in vivo at the target
site.

4.18 SCREENING ASSAYS 

Using the isolated proteins and polynucleotides of the invention, the present invention
further provides methods of obtaining and identifying agents which bind to a polypeptide
encoded by an ORF corresponding to any of the nucleotide sequences set forth in SEQ ID NO: 1-1350,
or bind to a specific domain of the polypeptide encoded by the nucleic acid. In detail, said
method comprises the steps of:

(a) contacting an agent with an isolated protein encoded by an ORF of the present
invention, or nucleic acid of the invention; and

(b) determining whether the agent binds to said protein or said nucleic acid.

In general, therefore, such methods for identifying compounds that bind to a
polynucleotide of the invention can comprise contacting a compound with a polynucleotide of
the invention for a time sufficient to form a polynucleotide/compound complex, and detecting
the complex, so that if a polynucleotide/compound complex is detected, a compound that binds
to a polynucleotide of the invention is identified.

Likewise, in general, such methods for identifying compounds that bind to a
polypeptide of the invention can comprise contacting a compound with a polypeptide of the
invention for a time sufficient to form a polypeptide/compound complex, and detecting the
complex, so that if a polypeptide/compound complex is detected, a compound that binds to a
polypeptide of the invention is identified.

Methods for identifying compounds that bind to a polypeptide of the invention can also
comprise contacting a compound with a polypeptide of the invention in a cell for a time
sufficient to form a polypeptide/compound complex, wherein the complex drives expression of a
reporter gene sequence in the cell, and detecting the complex by detecting reporter gene
sequence expression, so that if a polypeptide/compound complex is detected, a compound that
binds a polypeptide of the invention is identified.

Compounds identified via such methods can include compounds which modulate the 
activity of a polypeptide of the invention (that is, increase or decrease its activity, relative to
activity observed in the absence of the compound). Alternatively, compounds identified via such
methods can include compounds which modulate the expression of a polynucleotide of the 
invention (that is, increase or decrease expression relative to expression levels observed in the 
absence of the compound). Compounds, such as compounds identified via the methods of the 
invention, can be tested using standard assays well known to those of skill in the art for their
ability to modulate activity/expression. 

The agents screened in the above assay can be, but are not limited to, peptides, 
carbohydrates, vitamin derivatives, or other pharmaceutical agents. The agents can be selected 
and screened at random or rationally selected or designed using protein modeling techniques. 

For random screening, agents such as peptides, carbohydrates, pharmaceutical agents and
the like are selected at random and are assayed for their ability to bind to the protein encoded by
the ORF of the present invention. Alternatively, agents may be rationally selected or designed.
As used herein, an agent is said to be "rationally selected or designed" when the agent is chosen
based on the configuration of the particular protein. For example, one skilled in the art can
readily adapt currently available procedures to generate peptides, pharmaceutical agents and the
like, capable of binding to a specific peptide sequence, in order to generate rationally designed
antipeptide peptides (for example, see Hurby et al., "Application of Synthetic Peptides: Antisense
Peptides," In Synthetic Peptides, A User's Guide, W.H. Freeman, NY (1992), pp. 289-307, and
Kaspczak et al., Biochemistry 28:9230-8 (1989)), or pharmaceutical agents, or the like.

In addition to the foregoing, one class of agents of the present invention, as broadly
described, can be used to control gene expression through binding to one of the ORFs or EMFs
of the present invention. As described above, such agents can be randomly screened or
rationally designed/selected. Targeting the ORF or EMF allows a skilled artisan to design
sequence specific or element specific agents, modulating the expression of either a single ORF or
multiple ORFs which rely on the same EMF for expression control. One class of DNA binding
agents are agents which contain base residues which hybridize or form a triple helix
by binding to DNA or RNA. Such agents can be based on the classic phosphodiester,
ribonucleic acid backbone, or can be a variety of sulfhydryl or polymeric derivatives which have
base attachment capacity.

Agents suitable for use in these methods preferably contain 20 to 40 bases and are
designed to be complementary to a region of the gene involved in transcription (triple helix - see
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan et
al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca
Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription
from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into
polypeptide. Both techniques have been demonstrated to be effective in model systems.
Information contained in the sequences of the present invention is necessary for the design of an
antisense or triple helix oligonucleotide and other DNA binding agents.

Agents which bind to a protein encoded by one of the ORFs of the present invention can
be used as a diagnostic agent. Agents which bind to a protein encoded by one of the ORFs of the
present invention can be formulated using known techniques to generate a pharmaceutical
composition.

4.19 USE OF NUCLEIC ACIDS AS PROBES

Another aspect of the subject invention is to provide for polypeptide-specific nucleic acid 
hybridization probes capable of hybridizing with naturally occurring nucleotide sequences. The 
hybridization probes of the subject invention may be derived from any of the nucleotide 
sequences SEQ ID NO: 1-1350. Because the corresponding gene is only expressed in a limited
number of tissues, a hybridization probe derived from any of the nucleotide sequences SEQ
ID NO: 1-1350 can be used as an indicator of the presence of RNA of a cell type of such a tissue in
a sample. 

Any suitable hybridization technique can be employed, such as, for example, in situ
hybridization. PCR as described in US Patents Nos. 4,683,195 and 4,965,188 provides
additional uses for oligonucleotides based upon the nucleotide sequences. Such probes used in
PCR may be of recombinant origin, may be chemically synthesized, or a mixture of both. The
probe will comprise a discrete nucleotide sequence for the detection of identical sequences or a
degenerate pool of possible sequences for identification of closely related genomic sequences.
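
A degenerate pool of possible sequences can be enumerated from a single probe written with ambiguity codes. The sketch below assumes the standard IUPAC nucleotide ambiguity codes, which the text does not itself mention; the function name is hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes (DNA alphabet).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(probe):
    """List every discrete sequence represented by a degenerate probe."""
    choices = (IUPAC[base] for base in probe.upper())
    return ["".join(combo) for combo in product(*choices)]
```

The pool size is the product of the number of bases permitted at each position, so it grows quickly: a probe with three fully ambiguous positions (NNN) already represents 64 discrete sequences.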

Other means for producing specific hybridization probes for nucleic acids include the
cloning of nucleic acid sequences into vectors for the production of mRNA probes. Such vectors
are known in the art and are commercially available and may be used to synthesize RNA probes
in vitro by means of the addition of the appropriate RNA polymerase, such as T7 or SP6 RNA
polymerase, and the appropriate radioactively labeled nucleotides. The nucleotide sequences may
be used to construct hybridization probes for mapping their respective genomic sequences. The
nucleotide sequence provided herein may be mapped to a chromosome or specific regions of a
chromosome using well known genetic and/or chromosomal mapping techniques. These
techniques include in situ hybridization, linkage analysis against known chromosomal markers,
hybridization screening with libraries or flow-sorted chromosomal preparations specific to
known chromosomes, and the like. The technique of fluorescent in situ hybridization of
chromosome spreads has been described, among other places, in Verma et al. (1988) Human
Chromosomes: A Manual of Basic Techniques, Pergamon Press, New York NY.

Fluorescent in situ hybridization of chromosomal preparations and other physical 
chromosome mapping techniques may be correlated with additional genetic map data. Examples 
of genetic map data can be found in the 1994 Genome Issue of Science (265:1981f). Correlation
between the location of a nucleic acid on a physical chromosomal map and a specific disease (or 
predisposition to a specific disease) may help delimit the region of DNA associated with that 
genetic disease. The nucleotide sequences of the subject invention may be used to detect 
differences in gene sequences between normal, carrier or affected individuals. 

4.20 PREPARATION OF SUPPORT BOUND OLIGONUCLEOTIDES

Oligonucleotides, i.e., small nucleic acid segments, may be readily prepared by, for 
example, directly synthesizing the oligonucleotide by chemical means, as is commonly practiced 
using an automated oligonucleotide synthesizer. 

Support bound oligonucleotides may be prepared by any of the methods known to those of
skill in the art using any suitable support such as glass, polystyrene or Teflon. One strategy is to
precisely spot oligonucleotides synthesized by standard synthesizers. Immobilization can be
achieved using passive adsorption (Inouye & Hondo, (1990) J. Clin. Microbiol. 28(6) 1469-72);
using UV light (Nagata et al., 1985; Dahlen et al., 1987; Morrissey & Collins, (1989) Mol. Cell
Probes 3(2) 189-207) or by covalent binding of base modified DNA (Keller et al., 1988; 1989); all
references being specifically incorporated herein.

Another strategy that may be employed is the use of the strong biotin-streptavidin 
interaction as a linker. For example, Broude et al. (1994) Proc. Natl. Acad. Sci. USA 91(8) 3072-6,
describe the use of biotinylated probes, although these are duplex probes, that are immobilized on
streptavidin-coated magnetic beads. Streptavidin-coated beads may be purchased from Dynal,
Oslo. Of course, this same linking chemistry is applicable to coating any surface with streptavidin.
Biotinylated probes may be purchased from various sources, such as, e.g., Operon Technologies 
(Alameda, CA). 

Nunc Laboratories (Naperville, IL) also sells suitable material that could be used. Nunc
Laboratories have developed a method by which DNA can be covalently bound to the microwell
surface termed CovaLink NH. CovaLink NH is a polystyrene surface grafted with secondary amino
groups (>NH) that serve as bridge-heads for further covalent coupling. CovaLink Modules may be
purchased from Nunc Laboratories. DNA molecules may be bound to CovaLink exclusively at the
5'-end by a phosphoramidate bond, allowing immobilization of more than 1 pmol of DNA
(Rasmussen et al., (1991) Anal. Biochem. 198(1) 138-42).

The use of CovaLink NH strips for covalent binding of DNA molecules at the 5'-end has
been described (Rasmussen et al., (1991)). In this technology, a phosphoramidate bond is employed
(Chu et al., (1983) Nucleic Acids Res. 11(8) 6513-29). This is beneficial as immobilization using
only a single covalent bond is preferred. The phosphoramidate bond joins the DNA to the
CovaLink NH secondary amino groups that are positioned at the end of spacer arms covalently
grafted onto the polystyrene surface through a 2 nm long spacer arm. To link an oligonucleotide to
CovaLink NH via a phosphoramidate bond, the oligonucleotide terminus must have a 5'-end
phosphate group. It is, perhaps, even possible for biotin to be covalently bound to CovaLink and
then streptavidin used to bind the probes.

More specifically, the linkage method includes dissolving DNA in water (7.5 ng/µl) and
denaturing for 10 min. at 95°C and cooling on ice for 10 min. Ice-cold 0.1 M 1-methylimidazole,
pH 7.0 (1-MeIm7), is then added to a final concentration of 10 mM 1-MeIm7. The ssDNA solution is
then dispensed into CovaLink NH strips (75 µl/well) standing on ice.

Carbodiimide 0.2 M 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide (EDC), dissolved in
10 mM 1-MeIm7, is made fresh and 25 µl added per well. The strips are incubated for 5 hours at
50°C. After incubation the strips are washed using, e.g., Nunc-Immuno Wash; first the wells are
washed 3 times, then they are soaked with washing solution for 5 min., and finally they are washed
3 times (wherein the washing solution is 0.4 N NaOH, 0.25% SDS heated to 50°C).
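
The dilutions implied by these volumes follow from simple proportionality (stock concentration times added volume equals final concentration times total volume). The sketch below works through them; the function names are hypothetical, and the only inputs are the figures stated above (75 µl of DNA solution and 25 µl of 0.2 M EDC per well, 0.1 M 1-methylimidazole stock brought to 10 mM).

```python
import math


def final_concentration(stock_conc, added_vol, total_vol):
    """Concentration after diluting added_vol of stock into total_vol."""
    return stock_conc * added_vol / total_vol


def stock_volume_needed(stock_conc, final_conc, total_vol):
    """Volume of stock required to reach final_conc in total_vol."""
    return final_conc * total_vol / stock_conc


# 75 ul of ssDNA solution plus 25 ul of 0.2 M EDC gives 100 ul per well.
well_volume_ul = 75 + 25
edc_final_molar = final_concentration(0.2, 25, well_volume_ul)

# Volume of 0.1 M 1-methylimidazole stock per 100 ul of 10 mM solution.
meim_stock_ul = stock_volume_needed(0.1, 0.010, 100)
```

On these figures the EDC is present at about 50 mM in each well during the incubation, and the 1-methylimidazole stock is diluted tenfold.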

It is contemplated that a further suitable method for use with the present invention is that
described in PCT Patent Application WO 90/03382 (Southern & Maskos), incorporated herein by
reference. This method of preparing an oligonucleotide bound to a support involves attaching a
nucleoside 3'-reagent through the phosphate group by a covalent phosphodiester link to aliphatic
hydroxyl groups carried by the support. The oligonucleotide is then synthesized on the supported
nucleoside and protecting groups removed from the synthetic oligonucleotide chain under standard
conditions that do not cleave the oligonucleotide from the support. Suitable reagents include
nucleoside phosphoramidite and nucleoside hydrogen phosphonate.

An on-chip strategy for the preparation of DNA probe
arrays may be employed. For example, addressable laser-activated photodeprotection may be
employed in the chemical synthesis of oligonucleotides directly on a glass surface, as described by
Fodor et al. (1991) Science 251(4995) 767-73, incorporated herein by reference. Probes may also
be immobilized on nylon supports as described by Van Ness et al. (1991) Nucleic Acids Res.
19(12) 3345-50; or linked to Teflon using the method of Duncan & Cavalier (1988) Anal. Biochem.
169(1) 104-8; all references being specifically incorporated herein.

To link an oligonucleotide to a nylon support, as described by Van Ness et al. (1991),
requires activation of the nylon surface via alkylation and selective activation of the 5'-amine of
oligonucleotides with cyanuric chloride.

One particular way to prepare support bound oligonucleotides is to utilize the
light-generated synthesis described by Pease et al., (1994) PNAS USA 91(11) 5022-6, incorporated
herein by reference. These authors used current photolithographic techniques to generate arrays of
immobilized oligonucleotide probes (DNA chips). These methods, in which light is used to direct
the synthesis of oligonucleotide probes in high-density, miniaturized arrays, utilize photolabile
5'-protected N-acyl-deoxynucleoside phosphoramidites, surface linker chemistry and versatile
combinatorial synthesis strategies. A matrix of 256 spatially defined oligonucleotide probes may be
generated in this manner.

4.21 PREPARATION OF NUCLEIC ACID FRAGMENTS 

The nucleic acids may be obtained from any appropriate source, such as cDNAs, genomic 
DNA, chromosomal DNA, microdissected chromosome bands, cosmid or YAC inserts, and RNA,
including mRNA without any amplification steps. For example, Sambrook et al. (1989) describes
three protocols for the isolation of high molecular weight DNA from mammalian cells (p.
9.14-9.23).

DNA fragments may be prepared as clones in M13, plasmid or lambda vectors and/or
prepared directly from genomic DNA or cDNA by PCR or other amplification methods. Samples
may be prepared or dispensed in multiwell plates. About 100-1000 ng of DNA samples may be
prepared in 2-500 ml of final volume.

The nucleic acids would then be fragmented by any of the methods known to those of skill 
in the art including, for example, using restriction enzymes as described at 9.24-9.28 of Sambrook et
al. (1989), shearing by ultrasound and NaOH treatment.

Low pressure shearing is also appropriate, as described by Schriefer et al. (1990) Nucleic
Acids Res. 18(24) 7455-6, incorporated herein by reference. In this method, DNA samples are
passed through a small French pressure cell at a variety of low to intermediate pressures. A lever
device allows controlled application of low to intermediate pressures to the cell. The results of
these studies indicate that low-pressure shearing is a useful alternative to sonic and enzymatic DNA
fragmentation methods.

One particularly suitable way for fragmenting DNA is contemplated to be that using the two
base recognition endonuclease, CviJI, described by Fitzgerald et al. (1992) Nucleic Acids Res.
20(14) 3753-62. These authors described an approach for the rapid fragmentation and fractionation
of DNA into particular sizes that they contemplated to be suitable for shotgun cloning and
sequencing. 

The restriction endonuclease CviJI normally cleaves the recognition sequence PuGCPy
between the G and C to leave blunt ends. Atypical reaction conditions, which alter the specificity of
this enzyme (CviJI**), yield a quasi-random distribution of DNA fragments from the small
molecule pUC19 (2688 base pairs). Fitzgerald et al. (1992) quantitatively evaluated the
randomness of this fragmentation strategy, using a CviJI** digest of pUC19 that was size
fractionated by a rapid gel filtration method and directly ligated, without end repair, to a lac Z minus
M13 cloning vector. Sequence analysis of 76 clones showed that CviJI** restricts PyGCPy and
PuGCPu, in addition to PuGCPy sites, and that new sequence data is accumulated at a rate
consistent with random fragmentation.
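
The difference between the normal and relaxed specificities can be illustrated by locating the recognition sites in a sequence and cutting between the G and the C. The sketch below is an in silico approximation only: it assumes a linear molecule and complete digestion at every site, and the function names and the short test sequence are hypothetical.

```python
import re

# Lookahead patterns so that overlapping sites are all reported.
# Pu = A or G, Py = C or T.
NORMAL_SITES = re.compile(r"(?=[AG]GC[CT])")                        # PuGCPy
RELAXED_SITES = re.compile(r"(?=[AG]GC[CT]|[CT]GC[CT]|[AG]GC[AG])")  # CviJI**


def cut_positions(seq, relaxed=False):
    """Return cut indices; each cut falls between the G and C of a site."""
    pattern = RELAXED_SITES if relaxed else NORMAL_SITES
    return [m.start() + 2 for m in pattern.finditer(seq.upper())]


def fragments(seq, relaxed=False):
    """Return the blunt-ended fragments of a linear, fully digested sequence."""
    bounds = [0] + cut_positions(seq, relaxed) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]
```

Because the relaxed specificity recognizes three of the four possible NGCN classes, it cuts far more often than the normal specificity, which is consistent with the quasi-random fragment distribution described above.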

As reported in the literature, advantages of this approach compared to sonication and 
agarose gel fractionation include: smaller amounts of DNA are required (0.2-0.5 µg instead of 2-5
µg); and fewer steps are involved (no preligation, end repair, chemical extraction, or agarose gel
electrophoresis and elution are needed).

Irrespective of the manner in which the nucleic acid fragments are obtained or prepared, it is 
important to denature the DNA to give single stranded pieces available for hybridization. This is 
achieved by incubating the DNA solution for 2-5 minutes at 80-90°C. The solution is then cooled 
quickly to 2°C to prevent renaturation of the DNA fragments before they are contacted with the 
chip. Phosphate groups must also be removed from genomic DNA by methods known in the art.

4.22 PREPARATION OF DNA ARRAYS

Arrays may be prepared by spotting DNA samples on a support such as a nylon membrane. 
Spotting may be performed by using arrays of metal pins (the positions of which correspond to an 
array of wells in a microtiter plate) to repeatedly transfer about 20 nl of a DNA solution to a 
nylon membrane. By offset printing, a density of dots higher than the density of the wells is 
achieved. One to 25 dots may be accommodated in 1 mm², depending on the type of label used. By 
avoiding spotting in some preselected number of rows and columns, separate subsets (subarrays) 
may be formed. Samples in one subarray may be the same genomic segment of DNA (or the same 
gene) from different individuals, or may be different, overlapping genomic clones. Each of the 
subarrays may represent replica spotting of the same samples. In one example, a selected gene 
segment may be amplified from 64 patients. For each patient, the amplified gene segment may be in 
one 96-well plate (all 96 wells containing the same sample). A plate for each of the 64 patients is 
prepared. By using a 96-pin device, all samples may be spotted on one 8 x 12 cm membrane. 
Subarrays may contain 64 samples, one from each patient. Where the 96 subarrays are identical, the 
dot span may be 1 mm² and there may be a 1 mm space between subarrays. 
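
The geometry of the 64-patient example can be checked with a short calculation. The following is a minimal sketch only; it assumes (the text does not say so) that the 64 dots of each subarray are laid out as an 8 x 8 grid on a 1 mm pitch, and the names dot_position and footprint_mm are hypothetical helpers introduced for illustration.

```python
# Sketch of the 96-subarray layout described above (assumptions labeled).
PIN_ROWS, PIN_COLS = 8, 12   # 96-pin device, matching a 96-well plate
PATIENTS = 64                # one plate (one sample) per patient
DOTS_PER_SIDE = 8            # ASSUMPTION: 64 dots arranged as an 8 x 8 grid
DOT_PITCH_MM = 1.0           # each dot occupies about 1 mm x 1 mm
GAP_MM = 1.0                 # 1 mm space between neighboring subarrays

# Distance from the start of one subarray to the start of the next.
SUBARRAY_PITCH_MM = DOTS_PER_SIDE * DOT_PITCH_MM + GAP_MM


def dot_position(patient, pin_row, pin_col):
    """Return (x_mm, y_mm) of the dot cell origin for one patient and one pin."""
    if not (0 <= patient < PATIENTS):
        raise ValueError("patient index out of range")
    if not (0 <= pin_row < PIN_ROWS and 0 <= pin_col < PIN_COLS):
        raise ValueError("pin index out of range")
    # Offset printing: each patient plate is printed at a different offset
    # inside every subarray, so one subarray holds all 64 patients.
    off_row, off_col = divmod(patient, DOTS_PER_SIDE)
    x = pin_col * SUBARRAY_PITCH_MM + off_col * DOT_PITCH_MM
    y = pin_row * SUBARRAY_PITCH_MM + off_row * DOT_PITCH_MM
    return x, y


def footprint_mm():
    """Return (width_mm, height_mm) covered by all 96 subarrays."""
    width = PIN_COLS * SUBARRAY_PITCH_MM - GAP_MM
    height = PIN_ROWS * SUBARRAY_PITCH_MM - GAP_MM
    return width, height


if __name__ == "__main__":
    print("total dots:", PIN_ROWS * PIN_COLS * PATIENTS)
    print("footprint (mm):", footprint_mm())
```

Under these assumptions each subarray plus its gap spans 9 mm, so the 96 subarrays cover roughly 107 mm by 71 mm and fit on the 8 x 12 cm membrane.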

Another approach is to use membranes or plates (available from NUNC, Naperville, Illinois) 
which may be partitioned by physical spacers, e.g., a plastic grid molded over the membrane, the grid 
being similar to the sort of membrane applied to the bottom of multiwell plates, or hydrophobic 
strips. A fixed physical spacer is not preferred for imaging by exposure to flat phosphor-storage 
screens or x-ray films. 

The present invention is illustrated in the following examples. Upon consideration of the 
present disclosure, one of skill in the art will appreciate that many other embodiments and variations 
may be made in the scope of the present invention. Accordingly, it is intended that the broader 
aspects of the present invention not be limited to the disclosure of the following examples. The 
present invention is not to be limited in scope by the exemplified embodiments which are intended 
as illustrations of single aspects of the invention, and compositions and methods which are 
functionally equivalent are within the scope of the invention. Indeed, numerous modifications and 
variations in the practice of the invention are expected to occur to those skilled in the art upon 
consideration of the present preferred embodiments. Consequently, the only limitations which 
should be placed upon the scope of the invention are those which appear in the appended claims. 

All references cited within the body of the instant specification are hereby incorporated by 
reference in their entirety. 

5.0 EXAMPLES 

5.1 EXAMPLE 1 

Novel Nucleic Acid Sequences Obtained From Various Libraries 

A plurality of novel nucleic acids were obtained from cDNA libraries prepared from various 
human tissues and in some cases isolated from a genomic library derived from a human chromosome, 
using standard PCR, SBH sequence signature analysis and Sanger sequencing techniques. The 
inserts of the library were amplified with PCR using primers specific for the vector sequences 
which flank the inserts. Clones from cDNA libraries were spotted on nylon membrane filters and 
screened with oligonucleotide probes (e.g., 7-mers) to obtain signature sequences. The clones were 
clustered into groups of similar or identical sequences. Representative clones were selected for 
sequencing. 

In some cases, the 5' sequence of the amplified inserts was then deduced using a typical 
Sanger sequencing protocol. PCR products were purified and subjected to fluorescent dye 
terminator cycle sequencing. Single-pass gel sequencing was done using a 377 Applied Biosystems 
(ABI) sequencer to obtain the novel nucleic acid sequences. In some cases RACE (Rapid 
Amplification of cDNA Ends) was performed to further extend the sequence in the 5' direction. 

5.2 EXAMPLE 2 

Novel Contigs 

The novel contigs of the invention were assembled from sequences that were obtained from 
a cDNA library by methods described in Example 1 above, and in some cases sequences obtained 
from one or more public databases. The sequences for the resulting nucleic acid contigs are 
designated as SEQ ID NO: 1-1350 and are provided in the attached Sequence Listing. The contigs 
were assembled using an EST sequence as a seed. Then a recursive algorithm was used to extend 
the seed EST into an extended assemblage, by pulling additional sequences from different databases 
(i.e., Hyseq's database containing EST sequences, dbEST version 114, gb pri 114, and UniGene 
version 101) that belong to this assemblage. The algorithm terminated when there were no 
additional sequences from the above databases that would extend the assemblage. Inclusion of 
component sequences into the assemblage was based on a BLASTN hit to the extending 
assemblage with BLAST score greater than 300 and percent identity greater than 95%. 
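
The seed-and-extend procedure can be illustrated as follows. This is a minimal sketch under stated assumptions, not the assembly software actually used: the function extend_assemblage, the search_hits callable and the toy hit table are all hypothetical; an iterative worklist stands in for the recursion; and each member is used as a query on its own rather than querying with the growing assemblage as a whole. Only the two thresholds (score greater than 300, identity greater than 95%) come from the text.

```python
# Sketch of seed-and-extend assemblage building (hypothetical helper names).

def extend_assemblage(seed_id, search_hits, min_score=300, min_identity=95.0):
    """Grow a set of sequence IDs from a seed until no new hit qualifies.

    search_hits(query_id) is an ASSUMED callable that returns an iterable of
    dicts with keys "id", "score" and "identity" (percent), standing in for
    BLASTN searches against the databases.
    """
    members = {seed_id}
    frontier = [seed_id]            # sequences not yet used as queries
    while frontier:                 # stops when nothing extends the set
        query = frontier.pop()
        for hit in search_hits(query):
            if hit["id"] in members:
                continue
            # Thresholds from the text: score > 300 and identity > 95%.
            if hit["score"] > min_score and hit["identity"] > min_identity:
                members.add(hit["id"])
                frontier.append(hit["id"])
    return members


# Toy data for illustration only; these IDs and values are invented.
TOY_HITS = {
    "EST_seed": [
        {"id": "EST_a", "score": 450, "identity": 98.0},
        {"id": "EST_b", "score": 310, "identity": 94.0},   # identity too low
    ],
    "EST_a": [
        {"id": "EST_c", "score": 305, "identity": 96.5},
        {"id": "EST_seed", "score": 450, "identity": 98.0},
    ],
    "EST_c": [
        {"id": "EST_d", "score": 280, "identity": 99.0},   # score too low
    ],
}


def toy_search(query_id):
    return TOY_HITS.get(query_id, [])


if __name__ == "__main__":
    print(sorted(extend_assemblage("EST_seed", toy_search)))
```

In the toy data, EST_a and EST_c join the assemblage, while EST_b and EST_d fail one threshold each and are left out.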

Table 3 sets forth the novel predicted polypeptides (including proteins) encoded by the 
novel polynucleotides (SEQ ID NO: 1 89-282) of the present invention, and their corresponding 
nucleotide locations to each of SEQ ID NO: 1 89-282. Table 3 also indicates the method by which 
the polypeptide was predicted. Method A refers to a polypeptide obtained by using a software 
program called FASTY (available from http://fasta.bioch.virginia.edu) which selects a polypeptide 
based on a comparison of the translated novel polynucleotide to known polynucleotides (W.R. 
Pearson, Methods in Enzymology, 183:63-98 (1990), herein incorporated by reference). Method B 
refers to a polypeptide obtained by using a software program called GenScan for human/vertebrate 
sequences (available from Stanford University, Office of Technology Licensing) that predicts the 
polypeptide based on a probabilistic model of gene structure/compositional properties (C. Burge 
and S. Karlin, J. Mol. Biol., 268:78-94 (1997), incorporated herein by reference). Method C refers 
to a polypeptide obtained by using a Hyseq proprietary software program that translates the novel 
polynucleotide and its complementary strand into six possible amino acid sequences (forward and 
reverse frames) and chooses the polypeptide with the longest open reading frame. 
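
The six-frame approach of Method C can be sketched generically. The code below is an illustration only and is not the Hyseq proprietary program, whose details are not given here. It assumes the standard genetic code and, as a further assumption, treats an open reading frame as the longest stop-free stretch beginning at a methionine; the function names are hypothetical.

```python
# Sketch of six-frame translation with longest-ORF selection (assumptions above).
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order; "*" marks a stop.
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons; unknown codons (e.g. with N) become 'X'."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def longest_orf(dna):
    """Return (peptide, strand, frame) for the longest ORF over six frames."""
    dna = dna.upper()
    best = ("", None, None)
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            protein = translate(seq[frame:])
            # Stop codons split the translation into stop-free segments.
            for segment in protein.split("*"):
                start = segment.find("M")
                if start == -1:
                    continue
                peptide = segment[start:]
                if len(peptide) > len(best[0]):
                    best = (peptide, strand, frame)
    return best


if __name__ == "__main__":
    print(longest_orf("CCATGAAATTTGGGTAACC"))
```

Frames are numbered 0 to 2 from the 5' end of whichever strand is being read, so the same peptide found on the complementary strand is reported with strand "-".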

The nearest neighbor results for SEQ ID NO: 1-1350 were obtained by a BLASTP 
version 2.0al 19MP-WashU search against Genpept release 120 and Geneseq database October 
12, 2000, update 21 (Derwent), using the BLAST algorithm. The nearest neighbor result showed the 
closest homologue for SEQ ID NO: 1-1350. The nearest neighbor results for SEQ ID NO: 1-1350 
are shown in Table 2 below. 

Tables 1, 2 and 3 follow. Table 1 shows the various tissue sources of SEQ ID NO: 1-1350. 
Table 2 shows the nearest neighbor result for the assembled contig. The nearest neighbor result 
shows the closest homolog with an identifiable function for each assemblage. Table 3 contains the 
start and stop nucleotides for the translated amino acid sequence that each assemblage 
encodes. Table 3 also provides a correlation between the amino acid sequences set forth in the 
Sequence Listing, the nucleotide sequences set forth in the Sequence Listing and the SEQ ID NO. in 
USSN 09/496,914. 

TABLE 1 



Tissue Origin 


RNA Source 


Hyseq Library Name 


SEQ ID NOS: 


adult brain 


GIBCO 


AB300I 


111 151 188 215 662-665 877 910 927 
976 1233 1319 


adult brain 


GIBCO 


ABD003 


41 49 74 101 111 120 132 141-142 151 
217 225 238 271 317 404 446 469 503 
513-514 535 550 564 573 666-669 798 
898 910 927 976 1067 1083 1085 1178 
1254 


adult brain 


Clontech 


ABR001 


39 216 238 327 356 535 927 1056 1121 
1178-1180 1199 1251 


adult brain 


Clontech 


ABR006 


74 611 949 1034 1136 


adult brain 


Clontech 


ABR008 


14 32 41 61 81 86 89 120 132 138 145 
147 188 197 208 225 227-239 250 300- 
303 312 316 328-331 340 357-362 374 
380 384-391 408 414 446 448 464-467 
483 488 495-496 505 512 521 535 550 
566 571 577 585 590 594 598 634 641 
658 666 683 725 742 764 767 786 801 
805 810 823 826 829 831 836 841 887- 
923 927 934 943 950-951 963 976 995 
1000-1001 1006 1026 1034 1048 1057- 
1067 1086 1088 1090 1118 1120 1122- 
1128 1142 1162 1181-1192 1199 1204 
1218-1219 1225 1232 1253 1267 1271- 
1306 1342 1347 1349-1350 


adult brain 


Clontech 


ABR011 


49 238 1219 


adult brain 


BioChain 


ABR012 


74 238 


adult brain 


Invitrogen 


ABR013 


868 1268 


adult brain 


Invitrogen 


ABT004 


49 117 138 191 217 252 291 305 535 
566 596 663 670 746 798 816-819 876 
892 898 922 943 963 1034-1036 1121 


cultured 
preadipocytes 


Stratagene 


ADP001 


41 74 101 138 211 238 304 537 582 
740 798 883 943 976 1067 


adrenal gland 


Clontech 


ADR002 


49 74 101 111 120 127 151 215 238 
240-247 316 330 363-364 404 414 534- 
535 833 924-940 950 963 976 1001 
1003 1067-1070 1118 1156 1193-1200 

1325 


adult heart 


GIBCO 


AHR001 


38 49 71-72 74-77 79 92 99 101 111 
118 129 132 138 151 158-163 182 195- 
203 215 217 238 264 269 353 384 398 
408 434-439 446 504 512-513 519 537 
562-573 577 611-614 616-619 658 661 
671-672 722 734 757-773 815 828-835 
874 891 898 919 926-927 976 988 
1021 1037 1041 1062 1067 1071 1080 
1083 1093 1122 1131 1185 1201 1254 
1308 1331 1335 


adult kidney 


GIBCO 


AKD001 


41 49 51 71-74 78-85 94 100-101 103- 
107 111 119-120 138 151 157 215 217- 
218 238 250 264 294 304 384 404 440 
446 454 477 504-505 509 514 518-519 
535 537 564 574-583 620-627 639 653 
673-675 705 753 789 831 844 851 859 
877 909 918 927 956 963 976 1067 
1074 1083 1095 1178 1302 1331 1335 


adult kidney 


Invitrogen 


AKT002 


11-12 41 49 111-112 215-217 294 316 
446 487 564 575 844 868 910 927 976 
1116 


adult lung 


GIBCO 


ALG001 


8 101 111 151 187 402 446 490 514 
518 537 545 549 580 582 592 594 634 
640 651-652 676-678 725 851 873 918 
952 976 1042 1067 1076 1083 1152 


lymph node 


Clontech 


ALN001 


8 111 121 151 180-182 188 215 537 
545 549 651 679-682 789 804-810 868 
873 927 952 976 1042 1059 1335 


young liver 


GIBCO 


ALV001 


8 64 79 111 186 215-216 238 446 514 
519 537 564 653 683-684 698 753 798 
813 833 840 858 927 976 1038-1039 
1051 1085 1224 1245 1256 


adult liver 


Invitrogen 


ALV002 


40 71 292-293 305 384 468-469 496 
505 657 675 714 753 832 844 941-942 
976 1040 1076 1256 1293 


adult liver 


Clontech 


ALV003 


976 


adult ovary 


Invitrogen 


AOV001 


8 32 36 38 41 49 51 71 74 79-80 101 
104 111 120 122-125 138 140 143-149 
151 188-190 207-212 215-217 238 264 
316 384 409 440 445-446 496 504 512 
514 518-519 535 537 549-550 564 566 
571 580 582 600 618 638 657 667 681 
685-697 699 705 722 735-744 761 771 
815 833 842-865 868 875-876 918 926- 
927 950 952 963 976 1023 1042 1048 
1051 1059 1072 1076 1083 1117 1120 
1124 1131 1144 1174 1224 1268 1331 
1335 


adult placenta 


Clontech 


APL001 


102 217 238 537 641 700 


placenta 


Invitrogen 


APL002 


663 851 1048 


adult spleen 


GIBCO 


ASP001 


8 45 74 111 132 140 151 185 217 238 
294 414 446 477 504 514 534 545 549 
592 722 873 883 952 976 1041-1042 
1083 1093-1094 1152 1224 


testis 


GIBCO 


ATS001 


72 107 111 113 126 140 151 183 215 
238 446 497 537 642 701-706 811 877 
927 962 976 1083 1117 1131 


adult bladder 


Invitrogen 


BLD001 


41 151 191 402-405 409 414 496 545 
592 607 706 873 952 1178 1329-1335 


bone marrow 


Clontech 


BMD001 


8 58-62 65-68 74 79 108 111 116 137 
147 151 164-174 213-215 238 305-307 
374 404 446 460 466 516 519 534 538- 
541 544-546 549-554 566 584 586 592 
596 607 610 628-629 643-645 652 707- 
708 774-789 844 866-871 873 919 927 
952 963 976 998 1034 1042 1064 1083 
1085 1120 1132 1152 1225 1229 1268 
1307 1310 


bone marrow 


Clontech 


BMD002 


6 8 37-38 52 74 77 105 111 129 132 
210 317 510-511 545 549 581 598 628 
638 724 766 789 844 860 868 873 919 
927 952 963 968 976 1042 1111 1141 
1160-1161 1229 1266 1346 


bone marrow 


Clontech 


BMD004 


111 238 282 549 1083 


adult colon 


Invitrogen 


CLN001 


52 260 264 299 494 536 545 564 592 
844 873 877 952 976 1042 1152 1268 
1336-1337 


adult cervix 


BioChain 


CVX001 


49 51 129 132 151 205 207 238 332- 
335 365-367 392-401 440 466 470-471 
518 537 597 629 832 877 927 976 1006 
1085 1117 1129-1134 1192 1202-1205 
1219 1309-1328 


diaphragm 


BioChain 


DIA002 


74 976 1083 


endothelial cells 


Stratagene 


EDT001 


32 40-41 49 74 79 101 111 120 132 
138 151 204-206 215-217 238 269 316 
414 433 505 510 513 550 555 580 582 
596 675 722 745 798 814 836-841 851 
918 976 1041 1043 1073 1083 1131 
1331 


Genomic clones 
from the short arm 
of chromosome 8 


Genomic DNA 
from Genetic 
Research 


EPM001 


525-532 927 


Genomic clones 
from the short arm 
of chromosome 8 


Genomic DNA 
from Genetic 
Research 


EPM003 


47 525 


Genomic clones 
from the short arm 
of chromosome 8 


Genomic DNA 
from Genetic 
Research 


EPM004 


525 927 


Genomic clones 
from the short arm 
of chromosome 8 


Genomic DNA 
from Genetic 
Research 


EPM005 


531 


esophagus 


BioChain 


ESO002 


74 138 238 


fetal brain 


Clontech 


FBR001 


441-442 927 


fetal brain 


Clontech 


FBR004 


215 893 927 1001 


fetal brain 


Clontech 


FBR006 


48 61 101 120 132 138 140 147 208 
225 271 317 319 336 359 368 405-414 
519 550 571 594 686 715 722 764 824 
829 836 859 909 927 943 947 963 1057 
1067-1068 1104 1135-1140 1162 1206- 
1207 1235 1268 1288 1307-1308 1319 
1338-1350 


fetal brain 


Clontech 


FBRs03 


111 446 


fetal brain 


Invitrogen 


FBT002 


41 51 120 151 192-194 264 504 512 
535 683 761 798 820-827 844 876 909 
963 976 1026 1048 1083 1144 1302 


fetal heart 


Invitrogen 


FHR001 


446 566 761 


fetal kidney 


Clontech 


FKD001 


51 74 111 127 140 151 184 294 537 
550 630-63) 1319 


fetal kidney 


Clontech 


FKD002 


111 976 1083 


fetal kidney 


Invitrogen 


FKD007 


238 974 


fetal lung 


Clontech 


FLG001 


463 566 976 1074 1083 1093 


fetal lung 


Invitrogen 


FLG003 


41 238 330 407 415-416 537 573 844 
859 1048 1083 1116 1192 


fetal liver-spleen 


Columbia 
University 


FLS001 


8 14 34-35 37 41 43 49 51 54-56 63-64 
69-71 74 77 79 87-90 101 107 110-111 
114 120 128-131 138 140 147 150-155 
197 210 215 217 225 238 312 367 384 
414 440 446 460 468 483 496 504-507 
511-515 518-519 523 533-535 537 541 
544-545 547-550 555-560 564 566 571 
577 582 585-586 598 636 646-647 649 
652 664 698 709-710 714 722-723 731 
735-736 746-753 761 784 798 823 829 
832 844 851 858-859 868 873 876 898 
927 943 949 952 963 976 984 1002 
1021 1023 1040 1042 1044 1050 1083 
1093 1116 1120 1129 1131 1144 1174 
1217 1251 1254 1256 1302 1308 1311 
1319 


fetal liver-spleen 


Columbia 
University 


FLS002 


8 36-37 41-46 49 54 64 71 74 79 101 
111 120 129 147 207 210 215-216 238 
250 330 353 359 366 383-384 414 478 
505 508-509 511 515-524 534-535 537 
544-545 564 566 571 577 591 598 638 
663 671 698 714 722 725 727 751 798 
851 859 873 876 909 927 949 952 983- 
984 1002 1023 1042-1044 1085 1095 
1131 1144 1178 1199 1233 1240-1270 
1331 1340 


fetal liver-spleen 


Columbia 
University 


FLS003 


64 535 976 1256 


fetal liver 


Invitrogen 


FLV001 


8 101 120 138 217 446 468 535 566 
580 722 730 749 844 918 943 976 1051 
1256 1331 


fetal liver 


Clontech 


FLV004 


537 926 1256 


fetal muscle 


Invitrogen 


FMS001 


51 111 264 312 369-370 404 417-421 
425 535 537 577 598 614 836 857 1141 
1208 1268 


fetal muscle 


Invitrogen 


FMS002 


537 


fetal skin 


Invitrogen 


FSK001 


13-26 32 41 51 89 107 111 147 151 
225 264 316 405 422-429 488-494 496 
519 534-535 537 566 675 732 859 876- 
877 898 947 949-950 963 976 1001 
1062 1076 1083 1117 1144 1165 1268 
1281 


fetal skin 


Invitrogen 


FSK002 


537 812 


fetal spleen 


BioChain 


FSP001 


87 549 


umbilical cord 


BioChain 


FUC001 


27-33 41 49 151 215 238 248-249 301 
316 446 495-503 519 521 534-535 537 
582 634 691 877 883 927 944-950 963 
976 1001 1075 1142-1143 1171 1218 
1243 1308 


fetal brain 


GIBCO 


HFB001 


41 49 57 79 87 103 111 120 132-135 
138 145 151 188 197 207 215 238 264 
271 294 316 367 414 440 446 466 504 
513-514 535 542-543 550 564 571 596 
635 648-654 675 711-715 722-723 798 
832 872 876 883 927 976 1095 1144 
1168 1171 1178 1211 1335 


macrophage 


Invitrogen 


HMP001 


238 


infant brain 


Columbia 
University 


IB2002 


49-50 77 81 89 105 111 136-138 140 
151 161 175-179 185 216-217 264 295 
299 308-310 371-373 462 476 504 511- 
513 533 537 564 566 571 655-657 662 
683 716-720 723 752 790-803 829 832 
858-859 876 898 909 949 976 1045- 
1047 1076-1087 1090 1093 1116 1122 
1144 1209-1213 1225 1233 1256 1319 
1341 


infant brain 


Columbia 
University 


IB2003 


41 50 77 104 132 215 238 508 512-513 
519 566 655 714 794 918 943 976 1067 
1092-1093 1233 


infant brain 


Columbia 
University 


IBM002 


311 472-473 753 1214 


infant brain 


Columbia 
University 


IBS001 


51 111 376 474 790 876 949 1144 1204 
1221 


lung , fibroblast 


Stratagene 


LFB001 


151 316 462 514 534 582 675 939 1131 


lung tumor 


Invitrogen 


LGT002 


1-7 41 74 79 94 115 120 138-139 156 
215 217 269 280 296 337 374-375 384 
404 446 454 475-480 498 514 518-519 
522 537 545 564 577 597 653 658 705 
721-724 754-756 779 859 868 872-874 
876-877 919 927 949 951-952 959 976 
1002 1042 1048-1053 1076 1083 1088- 
1089 1131 1144-1147 1216-1218 1229 
1293 1311 


lymphocytes 


ATCC 


LPC001 


41 74 111 132 151 253 316 446 550 
634 844 927 976 1085 1268 


leukocyte 


GIBCO 


LUC001 


8 11 41 74 86 91-98 101 109 111 120 
147 151 212 215 218 238 252 288 312- 
314 316 338 359 408 427 443-447 505 
510 512 514 518 534 545 549-550 561 
564 566 571 577 580 582 587-609 615 
632-638 658-659 698 714 725-728 832 
836 841 859 866 873-874 882-883 918- 
919 927 943 952 963 976 1042 1076 
1083 1090 1148 1152 1168 1195 1219- 
1220 1224 


leukocyte 


Clontech 


LUC003 


74 100 215 232 238 339-341 446 545 
657 660 729 873 883 927 952 963 1008 
1042 1116 1120 1149-1150 1215 1222 


Melanoma from cell 
line ATCC #CRL 
1424 


Clontech 


MEL004 


210 215 238 342 534 545 592 722 873 
919 929 939 952 976 1071 1118 1218 
1235 1245 


mammary gland 


Invitrogen 


MMG001 


8-10 40-41 49 73 80 114 138-140 147 
217 250-256 264 297-299 305 377-378 
398 446 481-486 505 512 537 545 549 
571 592 725 730-733 816 829 836 844 
868 873 876-877 898 926 943 951-960 
963 976 995 1034 1042 1048 1054- 
1055 1076 1083 1091 1093 1116-1117 
1124 1152 1302 


induced neuron cells 


Stratagene 


NTD001 


39 101 111 138 238 361 1225 1251 
1319 


retinoic acid induced 
neuronal cells 


Stratagene 


NTR001 


74 225 976 


neuronal cells 


Stratagene 


NTU001 


129 225 238 304 313 361 657 976 


pituitary gland 


Clontech 


PIT004 


976 


placenta 


Clontech 


PLA003 


38 976 


prostate 


Clontech 


PRT001 


111 188 238 257-258 564 724 961-966 
1067 1095 


rectum 


Invitrogen 


REC001 


238 430-431 841 859 868 963 1001 
1116 


salivary gland 


Clontech 


SAL001 


8 151 402 432-433 446 496 868 952 
976 1083 1120 1151 1184 


small intestine 


Clontech 


SIN001 


8 101 147 215 259-266 446 462 505 
545 592 660 789 836 866 873 927 952 
963 967-978 1042 1120 1152 1223- 
1224 


skeletal muscle 


Clontech 


SKM001 


238 302 927 943 992 1031 


spinal cord 


Clontech 


SPC001 


74 111 132 151 215-216 238 264 267- 
270 343-344 353 379 516 537 566 740 
828 927 976 979-994 1092 1153-1159 
1225 1250 


adult spleen 


Clontech 


SPLc01 


698 859 1042 


stomach 


Clontech 


STO001 


210 238 271-272 537 580 705 918 952 
995 1171 


thalamus 


Clontech 


THA002 


61 219-220 273-276 312 315 330 596 
963 996-1007 1059 1093 1160-1162 


thymus 


Clontech 


THM001 


8 120 151 208 221 316-317 353 639 
750 867 874 878-881 927 963 1023 
1083 1094-1096 1124 


thymus 


Clontech 


THMc02 


8 61 114 129 132 210 225 231 306 
317-319 336 340 359 380 398 446 448- 
463 512 519 545 554 587 598 698 724- 
725 789 812 836 868 873 927 947 952 
976 1007 1042 1083 1085 1097-1116 
1122 1147 1177 1226-1229 1234 1311 
1313 


thyroid gland 


Clontech 


THR001 


14 41 49 76 94 111 144 151 183 188 
210 217 222 253 264 271 277-286 294 
320-326 345-352 361 381-382 446 467 
483 514 534 549-550 564 578 602 649 
844 882-883 927 950 956 976 1008- 
1028 1076 1083 1117-1120 1142 1163- 
1175 1230-1238 1308 


trachea 


Clontech 


TRC001 


223-225 238 287 353-354 514 
545 592 611 873 883-884 927 
952 1029-1031 1042 1151-1152 
1170 1176-1177 1239 


uterus 


Clontech 


UTR001 


151 226 288-290 355 537 877 
885-886 976 1001 1032-1033 
1232 



TABLE 2 



SEQ 

ID 

NO: 


Accession 
No. 


Species 


Description 


Smith- 
Waterman 
Score 


% 

Identity 


1 


B02829 


Homo sapiens 


Human G protein coupled receptor hRUP5 
protein SEQ ID NO:10. 


460 


100 


2 


G03564 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7645. 


in 


51 


3 


R26173 


Homo sapiens 


Part of Major Yo paraneoplastic antigen 
[L-uKoz.) encoded by clone pYz. 


293 


76 


4 




Homo sapiens 


^nl^trifn r ~ h n M ^* ! T ^* Q^Txriji f ciTnirnlf 

cmciuin ^noiinci i^-iypc aipna i suoumt 


191 


65 


5 


Y94943 


Homo sapiens 


Human secreted protein clone ytl 4_1 protein 
sequence SEQ ID NO:92. 


251 


50 


6 


M11507 


Homo sapiens 


transferrin receptor 


120 


95 


7 


AF099100 


Homo sapiens 


WD-repeat protein 6 


1941 


93 


8 


Y9233S 


Homo sapiens 


Human cancer associated antigen precursor from 
clone NY-REN-45. 


245 


82 


9 


G01343 


Homo sapiens 


Human secreted protein, SEQ ID NO: 5424. 


226 


91 


10 


AJ133798 


Homo sapiens 


copine VII protein 


1127 


68 


11 


G02449 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6530. 


584 99 


12 


X98330 


Homo sapiens 


ryanodine receptor 2 


282 | 78 


13 


AL024498 


Homo sapiens 


dJ417M14.2 (novel serine/threonine-protein 
kinase (ortholog of mouse and rat MAK (male 
germ cell-associated kinase)) 


293 


100 


14 


AF045577 


Pan 

troglodytes 


olfactory receptor OR93Ch 


191 


36 


15 


G03131 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7212. 


93 


39 


16 


U26595 


Rattus 
norvegicus 


prostaglandin F2a receptor regulatory protein 
precursor 


569 


89 


17 


B08918 


Homo sapiens 


Human secreted protein sequence encoded by 
gene 28 SEQ ID NO:75. 


99 


44 


18 


Y36203 


Homo sapiens 


Human secreted protein #75. 


165 


75 


19 


U15647 


Mus 

musculus 


reverse transcriptase 


106 


40 


20 


G02701 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6782. 


544 


100 


21 


Y35923 


Homo sapiens 


Extended human secreted protein sequence, SEQ 
ID NO. 172. 


1691 


100 


22 


G04030 


Homo sapiens 


Human secreted protein, SEQ ID NO: 8111. 


380 


96 


23 


G02455 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6536. 


123 


50 


24 


AF036329 


Homo sapiens 


gonadotropin-releasing hormone precursor, 
second form 


284 


90 


25 


G04067 


Homo sapiens 


Human secreted protein, SEQ ID NO: 8148. 


96 


32 


26 


S80119 


Rattus sp. 


reverse transcriptase homolog 


100 


34 


27 


U83303 


Homo sapiens 


line-1 reverse transcriptase 


101 


35 


28 


G03267 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7348. 


135 


45 


xy 


u\j*mo / 




Human c^rret^H nrntein SEO ID NO - 8148 


83 


42 


30 




Homo sapiens 


nuinan secreicu proiem, any iu iskj. jj ■ 


1 1 6 


72 


31 


G03371 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7452. 




O / 


-7 -> 

32 


UUjZZ4 


no mo sapiens 


Human o^rr^tprl nrntem SPO XT} MO' 730^ 

nunian secreicu pruLcui, ollk^ llj i^iw* 


JO 


32 


33 


Y66688 


Homo sapiens 


Membrane-Douno. protein rKU i 1 jz. 


9AS7 


QR 


34 


Y87071 


Homo sapiens 


Human secreted protein sequence SEQ ID 












NO:l 10. 






35 


U15131 


Homo sapiens 


pi Zb 


1 fSZ 


to 


36 


Y73464 


Homo sapiens 


Human secreted protein clone y!4 1 protein 




on 








sequence afcQ 1L> fNU-tjU. 






37 


AL133215 


Homo sapiens 


W A 1 MOT '"J J!T / , - 1 1 i-i ■ ■ ■ ■ Af\ f ^l^mAtn 

bAluoLv.o (semapnorin 4U iscma domain, 


OS / 


oo 

77 








immunoglobulin domain (Ig)» transui e m branc 


















38 


AC 067969 


: tt 

amino acids 


Homo sapiens ryanodine receptor 1 (skeletal) 


IBS 


DO 






333o-40oo 








39 


AL031588 


Homo sapiens 


oJ 1 163 Jl. 1 (.mostly supported oy utiNa^AN, 




7fi 








rtjtrJt-o ano LiiiiNjDVvi&Crj 






4U 


Lj03ozcI 


1 : 

Homo sapiens 


Human secreicu protein, ony iu . i / <jy. 


i in 




41 


AF 132969 


Homo sapiens 


l-vj1-33 protein 




Oo 


42 


Y36268 


Homo sapiens 


Human secreted protein encoded by gene 45. 




RS 
oo 


43 


X61048 


Hydra sp. 


mini -collagen 


i rt^ 
1 UD 


JJ 


44 


M76546 


HeJianthus 


hydroxyproline-rich protein 


1 in 


-J J 






annuus 








45 


U82288 


Caenorhabditi 


Rac-like GTPase 










s elegans 








46 


G03477 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7558. 


lis 


JO 


47 


AF090942 


Homo sapiens 


r*KU0657 


111 
113 


SI 


48 


G03564 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7645. 


r\r\ 

y\) 




49 


AJO0556O 


Mus 


SPR2B protein 


72 


JO 






musculus 








50 


G02450 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6531. 


383 


OS 


51 


Y91649 


Homo sapiens 


Human secreted protein sequence encoded by 


7/5 


OA 








gene 60 ofcQ ID N(J:322. 






52 


U93563 


Homo sapiens 


putative p 1 50 


i f\< 

J ly5 


■50 

JO 


53 


Y55927 


Homo sapiens 


Human STLK2 protein. 




85 


54 


G02607 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6688. 


1 A <. 

145 




55 


ABO08175 


Mus 


hepatic nuclear factor 1 -beta short form 


356 


74 






musculus 








56 


M68941 


Homo sapiens 


protein-tyrosine phosphatase 


1 

103 


A I 


57 


AL03160U 


Homo sapiens 


ciyoiio. 1 (chloride channel i) 


5^ S 
5 5o 


7r< 

/O 


58 


AF011417 


Mus 


putative pheromone receptor 


t 4 1 


<< 
jj 






musculus 








59 


AF 1 673zO 


Mus 


zmc tmger protein Lx r 1 1 3 




Do 






musculus 








60 


o7303o 


Homo sapiens 


— : — — 

interferon regulatory factor 7 




OA 


oJ 


AU7984 


Mus 


protein-tyrosine kinase 


/ 


AO 






musculus 








OZ 


Y2yool 


Homo sapiens 


Human secrciec proiem cionc cu7o_t. 


701 


98 


63 


U35376 


Homo sapiens 


repressor transcriptional factor 


*o J 


C< 

OJ 






Homo sapiens 


uoicjiiiLin*coiijugaung o 1 rv-tioinajn en^ynic 


rOJ 


74 








A POT I OfJ 






Oj 


IjUjooj 


Homo sapiens 


nurnan secreiea protein, ot^ lu i\u. / jus. 


OO 


95 


66 


AF 177390 


Manduca 


antennal specific membrane protein AMP 


")74 


j*» 






sexta 








67 


AB040KW 


Homo sapiens 




fild 
Ol 4 * 


inn 
i *j\i 


DO 




Equine 


24 


213 


26 






herpesvirus 4 








69 


G02965 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7046. 


261 


95 


70 


W75770 


Homo sapiens 


Human oxidoreductase YTF03. 


1144 


98 


71 


AB011135 


Homo sapiens 


KIAA0563 protein 


239 


76 


72 


AB014885 


Halocynthia 


HrPOPK-1 


813 


78 






roretzi 








73 


AJ- 045454 


Cavia 


phospholipase B 


955 


73 






porcellus 








74 


J02870 


Mus 


laminin receptor 


308 


61 









musculus 








75 


Y00826 


Rattus 
norvegicus 


1 ft f A A 1 1 00£\ 

gpilU (AA 1-lsso) 




H4 


76 


AF117754 


Homo sapiens 


thyroid hormone receptor-associated protein 
complex component TRAP240 


351 


54 


77 


Y38422 


Homo sapiens 


Human secreted protein. 


468 


76 


78 


Y14596 


Homo sapiens 


Human T-type voltage-gated Ca channel alpha- 
I -1 (nt^av 1 3 ). 


1337 




79 


Y1459I 


Human 
papillomavirus 
type 68 


APM-1 protein 


767 


100 




ATI noAi 


Homo sapiens 


OJ /yoAiu.z ^jviaau44D protein} 


7 1 
/ 1 


id 


81 


AP000383 


Arabidopsis 
thaliana 


protein arginine N-methyltransferase-like protein 


359 


65 


82 


L46815 


Mus 

musculus 


DNA binding protein Rc 


895 


75 


OJ 


uoiooo 


Homo sapiens 


Human secreted protein, iitxi lL> NU. jooi. 


j I J 


yo 


84 


Y53886 


Homo sapiens 


A suppressor of cytokine signalling protein 
designated i ioCUr-o. 


<1 9 


Tt 


o< 
85 




— — : 

Homo sapiens 


NJAAJU/y protein 


1 1A. 

J 31 




86 


Y2tS67o 


Homo sapiens 


Human cw272_7 secreted protein. 




oz 


87 


Y99368 


Homo sapiens 


Human PR01326 (LTNQ686) amino acid 

cca in xr^^-ior\ 

sequence 5>fc,y IU NU.iuu. 


156 


48 


88 


AJ225124 


X A. .... 

MUS 

musculus 


hyperpolarization-activated cation channel, 
HAC3 


45 / 




89 


AF177203 


Homo sapiens 


cerebral cell adhesion molecule 


z.y\) 


DO 


90 


Y28280 


Homo sapiens 


Human G-protein coupled receptor GRIR-2. 


jZo 


/y 


91 


L39891 


Homo sapiens 


polycystic kidney disease-associated protein 


1 *T« 1 


ys 


92 


AF064876 


Homo sapiens 


ion channel BCNG-1 


953 


99 


93 


AF170723 


Homo sapiens 


protein kinase STK10 


401 


53 


94 


X13292 


Trypanosoma 
brucei 


GPI-phospholipase C (AA 1-358) 


151 


37 


95 


Y34I27 


Homo sapiens 


Human potassium channel K+Hnovl 1. 


661 


yy 


96 


X03638 


Rattus 
norvegicus 


sodium channel protein I (aa 1-2009) 


1775 


92 


97 


AF134213 


Homo sapiens 


ubiquitin-specific protease 


1995 


99 


98 


G00838 


Homo sapiens 


Human secreted protein, SEQ ID NO: 4919. 


213 


38 


99 


AF021935 


Rattus 
norvegicus 


myotonic dystrophy kinase-related Cdc42-binding 
kinase 


675 


48 


100 


AF279265 


Homo sapiens 


putative anion transporter 1 


867 


no 


101 


AC007878 


Homo sapiens 


match to nuclear protein, NP220; note: sequence 
difference at residue 58 


160 


60 


102 


U22829 


Mus 

musculus 


P2Y purinoceptor 


264 


42 


103 


Y45023 


Homo sapiens 


-— ~ : — ; — 

Human sensory transduction G-protein coupled 
receptor-B3. 


3 1 1> 


yy 


1U4 


r y^yyu 


Homo sapiens 


Unman «^rwlw4 nmt^in xvKO 1 1 Qt7 C\ TT\^SC\-'J{\ 

numan secrciea proiem voi i i, ocy llj invj.^vj. 


7R7 


7& 




T 5/342 


Homo sapiens 


Human signal peptide containing protein HSPP- 
119 SEQ ID NO:119. 








At 1 lo>J 1 2. 


Homo sapiens 


hepatic angiopoietin-related protein 


0 1 9 


u/ 




AT 1 lOCO / 


Homo sapiens 






S7 


i ns 




Escherichia 
coli 


sialic acid transporter 


Jo I 




109 


Y38395 


Homo sapiens 


Human secreted protein encoded by gene No. J 0. 


693 


100 


i i ri 
1 1U 


I /SoUl 


Homo sapiens 


Hydrophobic domain containing protein clone 

VTPOOfi31 ft mine* arid <M*mience 


1 87 




111 


Z25535 


Homo sapiens 


nuclear pore complex protein hnupl53 


464 


85 


112 


Y94939 


Homo sapiens 


Human secreted protein clone ye90_1 protein
sequence SEQ ID NO:84.


274 


51 


113 


AF016365 


Homo sapiens 


hexokinase 1 isoform td 


301 


71 


114 


AC007956 


Homo sapiens 


unknown 


520 


75 


115 


M83738 


Homo sapiens 


protein-tyrosine phosphatase 


251 


92 


116 


AL157952


Homo sapiens 


dJ875K15.1.1 (ets homologous factor (ets-
domain transcription factor ESE-3A, isoform 1))


484 


91 


117 


W18084


Homo sapiens 


Human Aurora-2. 


546 


87 
SEQ
ID
NO:


Accession
No.


Species


Description


Smith-
Waterman
Score


%
Identity


118 


L41816 


Homo sapiens 


cam kinase I 


407 


62 


119


AJOUoViO 


Rattus 
norvegicus 


phosphatidyl inositol 3-kinase 


OZ 1 




120 




Bos taurus 


pyruvate dehydrogenase phosphatase regulatory
subunit precursor, PDPr 


1 (LAfi. 
10*K) 




121 


r\^i 


Homo sapiens 


protein tyrosine phosphatase, PTPase {EC
3.1.3.48} 


515 


DO 


122 


UoOoU!) 


Homo sapiens 


oncostatin-M specific receptor beta subunit 


ZOZ 


RS 
OO 


123 


Y44403 


Homo sapiens 


Human truncated tankyrase-1. 


l l 1 

111 


-i« 
JJ 


124 


U88167 


Caenorhabditis
elegans


contains similarity to C2 domains 




io 

j.y 


125 


AF300648 


Homo sapiens 


guanine nucleotide binding protein beta subunit
4


693 


90 


126 


AB021861 


Mus 

musculus


apoptosis signal-regulating kinase 2 




□5 


127 


AF305210 


Homo sapiens 


concentrative Na+-nucleoside cotransporter 
hCNT3


807 


97 


128 


M90360 


Homo sapiens 


protein kinase 


L1X) 


15 


129 


D32202 


Homo sapiens 


alpha 1C adrenergic receptor isoform 2 


5 /4 


So 


130 


AF208043 


Homo sapiens 


IFI16b 


496 


67 


131 


AF201734 


Mus 

musculus 


testis specific serine kinase-3 


SOU 


H / 


132 


AF112886


Bos taurus 


differentiation enhancing factor 1 


159 


74 


133 


AJ278314 


Homo sapiens 


phospholipase C-beta-1b


554 


85 


134 


W74802 


Homo sapiens 


Human secreted protein encoded by gene 73 
clone HSQEL25. 


1157 


87 


135 


AB020335 


Homo sapiens 


Pancreas-specific gene


668 


96 


136 


W80408 


Homo sapiens 


A secreted protein encoded by clone dto74_2.


866 


98 


137 


AC002563 


Homo sapiens 


putative RHO/RAC effector protein; 95% 
similarity to P49205 (PID:g1345860)


5041 


99 


138 


Y96736 


Homo sapiens 


PRO3434, a novel secreted protein.


891 


100 


139 


AB024034 


Arabidopsis 
thaliana 


DNA-damage inducible protein DDI1-like


147 


55 


140 


W97809 


Homo sapiens 


Human GTPase regulator GRAF. 


248 


56 


141 


Y51557 


Homo sapiens 


Human PLA2 protein. 


125 


46 


142 


AF090113 


Rattus 
norvegicus 


AMPA receptor binding protein 


623 


93 


143 


W26642 


Homo sapiens 


Human RECK cancer-inhibiting protein.


641 


82 


144 


U87306 


Rattus 
norvegicus 


transmembrane receptor UNC5H2 


578 


84 


145 


AF264014 


Homo sapiens 


scavenger receptor cysteine-rich type 1 protein 
M160 precursor


727 


92 


146 


W63683 


Homo sapiens 


Human secreted protein 3. 


140 


40 


147 


M96264 


Homo sapiens 


galactose-1-phosphate uridyl transferase


513 


81 


148 


D64014 


Escherichia 
coli


HrsA 


818 




149


M8JJ lo 


Escherichia 
coli 


pppGpp phosphohydrolase 


Ol < 

J 1 J 




150 


AL163279


Homo sapiens 


homolog to cAMP response element binding and 
beta transducin family proteins 


1261 


99 


151


AT 1 I700 f 


Homo sapiens


o l ii^u-iiKC Kinase 




OO 


152 


R95332 


Homo sapiens 


Tumor necrosis factor receptor 1 death domain
ligand (clone s ivf ).


392 


61 


153 


AF151859 


Homo sapiens 


CGI-101 protein


370 


92 


154 


X66937 


Homo sapiens 


hexokinase type I 




S 1 

ol 


155




Homo sapiens 


alternatively spliced form 


432 


92 


156 


G00857


Homo sapiens 


Human secreted protein, SEQ ID NO: 4938. 


349 


78 


157 


AF159455


Mus 

musculus 


zinc finger protein 


352 


74 


158 


L76191 


Homo sapiens 


interleukin-1 receptor-associated kinase 


537 


76 


159 


AP001743 


Homo sapiens 


putative gene, ankirin like, possible dual 
specifity Ser/Thr/Tyr kinase domain 


670 


98 


160 


AJ250425 


Rattus 
norvegicus 


Collybistin I 


556 


74 


161 


G02885 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6966. 


370 


100 



162 


Z22968 


Homo sapiens 


M130 antigen


610 


100 


163 


AF181121 


Homo sapiens 


ATP-dependent Ca2+ pump PMR1 


336 


92 


164 


AF055636 


Homo sapiens 


leucine-rich glioma-inactivated protein precursor


455 


94 


165 


AF160798


Rattus 
norvegicus 


calcium transporter CaT1


700 


96 


166 


Y76332 


Homo sapiens 


Fragment of human secreted protein encoded by 
gene 38. 


327 


45 


167 


Y48607 


Homo sapiens 


Human breast tumour-associated protein 68. 


1072 


99 


168 


AB020741 


Mus 

musculus 


NIK-related kinase 


197 


43 


169 


AF252293 


Homo sapiens 


PAR3 


596 


44 


170 


U59429


gen. sp. 


diacylglycerol kinase eta


481 


82 


171 


AF035268 


Homo sapiens


phosphatidylserine-specific phospholipase A1


386 


42 


172


nj vO J 


Mus 

musculus 


semaphorin cytoplasmic domain-associated
protein 3B 


507 


82 




V77Q1 R 




Human secreted protein encoded by gene No.
123.


653 


99 


174




Homo sapiens


Human secreted protein, SEQ ID NO: 7060.


538 


97 


175




Mus 

musculus


ciri Dry unit* 5icjn ecu piiur>piiaL<iot~ 


168 


55 


176


W 7JOi7 


Homo sapiens


Ufimn ennime Cf»i*r^t^H nrntpin t>ftnp rlnne 

A lUllJlF MljJ l&ll J dWWI Vt-VU V will w< ^ivllv 

gm196_4.


1022 


100 


177 




Homo sapiens


formiminotransferase cyclodeaminase form D


255 


93 


178 


X04936 


Homo sapiens 


T-cell receptor alpha-chain (413 is 2nd base in 


710 


99 


179 


AF127481


Homo sapiens


non-oncogenic Rho GTPase-specific GTP
exchange factor 


175 


80 


180 


G00978 


Homo sapiens 


Human secreted protein, SEQ ID NO: 5059. 


517 


94 


181


I UUVIJ 




Membrane-bound protein PRO1310.


671 


96 


182 


AF 1 10fi4fl 


Homo sapiens


orphan seven-transmembrane receptor


862 


100 






tfiiiriiQ 




766 


84 


184 


AF 169691 


Homo sapiens 


cadherin-like protein VR8 


375 


38 


185




Homo sapiens


thyrotropin-releasing hormone degrading
ectoenzyme


985 


99 


186




Homo sapiens


rth cn h oH i ^ct pra "M* 


541 


76 


187 


G02920 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7001. 


254 


93 


188


I 7^1? 1 IS 


Homo sapiens 


HiiTtiflti c^f^r^t^rl nrntrin ^Innf* HH^jQ^l 1ft nrotf in 

"jpnnenre *!FO TT> KO*42 


301 


98 


189 


Y66713


Homo sapiens 


Membrane-bound protein PRO1309. 


694 


100 


190


nni7dj. 




Human qftrreterl nrntein RFO TD NO* 712*1 


331 


73 


191 


U36771 


Rattus 


sn-glycerol 3-phosphate acyltransferase


707 


92 


192 


R05935 


Homo sapiens 


Secreted GPIIb subunit of multiple subunit 
polypeptide (MSP) GPIIb-IIIa.


157 


72 


193 


M92084 


Theileria
parva 


casein kinase II alpha subunit


364 


50 


194


I OOOhJ 


Homo sapiens


^.V^Clll LJl oi IC Ul V Lt-1 11 1 l\w 1 J lUi 


448 


90 


195


w yjoj j 


Homo sapiens


HnmA eAnimic c^ , or/*t"^H rsrAt^in c?**rir* f 1 ("Ml 

SK*JJ1C41S* a^»-wl^LC\J !J\h11*w uiuitv^ 

hj 968 2. 


382 


49 


196 


AF255614 


Rattus 
norvegicus 


scaffolding protein SLIPR


680 


99 


197 


AC021640 


Arabidopsis 
thaliana


putative phosphatidate phosphohydrolase 


300 


41 


198 


AF073967 


Mus 

musculus 
domesticus 


olfactory receptor 


316 


43 


199 


W01730


Homo sapiens 


Human G-protein receptor HPRAJ70. 


617 


98 


200 


AF117948


Homo sapiens 


pancreas-enriched phospholipase C 


625 


89 


201 


AF128625


Homo sapiens 


CDC42-binding protein kinase beta 


636 


94 


202 


AF117946


Homo sapiens 


Link guanine nucleotide exchange factor II 


1303 


100 


203 


Y53021 


Homo sapiens 


Human secreted protein clone qc646_1 protein
sequence SEQ ID NO:48.


701 


99 


204 


AF227968 


Homo sapiens 


SH2-B beta signaling protein 


182 


79 


205 


S81752 


Homo sapiens 


DPH2L=candidate tumor suppressor gene


375 


100 











{ovarian cancer critical region of deletion}






206 


T 7 1 OIK 

U lis j 1 3 


Sus scrofa


parathyroid receptor 


1 7*7 


fin 


207 


AF255342 


Homo sapiens 


putative pheromone receptor V1RL1 long form


170 


96 


208 


S52051 


Rattus sp. 


neurotransmitter transporter 






209 


W63683


Homo sapiens 


Human secreted protein 3. 






210 


D79992 


Homo sapiens 


similar to Drosophila photoreceptor cell-specific 
protein, calphotin. 


541 


82 


211


AF117948


Homo sapiens 


pancreas-enriched phospholipase C


1 348 




212 


U81035 


Rattus 
norvegicus 


ankyrin binding cell adhesion molecule 
neurofascin 


AT t 




213 


AF154846


Homo sapiens 


zinc finger protein 


798 


56


214 


AF102777


Mus 

musculus 


FYVE finger-containing phosphoinositide kinase 


ait 


y3 


215 


AL163303


Homo sapiens 


putative gene containing transmembrane domain 


523 


89 


216 


U26595 


Rattus 
norvegicus 


prostaglandin F2a receptor regulatory protein 
precursor 


563 


7ti 


217 


G04095 


Homo sapiens


Human secreted protein, SEQ ID NO: 8176. 


644 


98 


218 


X75756 


Homo sapiens 


protein kinase C mu 


314


81 


219 


Y66723 


Homo sapiens 


Membrane-bound protein PRO1100.


770 


98 


220 


D88577 


Mus 

musculus 


Kupffer cell receptor


567 


40 


221 


AF258465 


Homo sapiens 


OTRPC4 


853 


100 


222 


AF021935 


Rattus 
norvegicus 


mytonic dystrophy kinase-related Cdc42-binding
kinase 


636 


96 


223 


AL136527


Homo sapiens 


bA215B13.1 (A kinase (PRKA) anchor protein
11) 


693 


100 


224 


AB032417 


Homo sapiens 


WNT receptor Frizzled-4 


690 


99 


225 


AF030430 


Mus 

musculus 


semaphorin VIa


703 


68 


226 


AE000218


Escherichia 
coli 


putative dihydroxyacetone kinase (EC 2.7.1.2)


297 


39 


227 


AF302150 


Homo sapiens 


phosphoinositol 3-phosphate-binding protein-2 


2080 


100 


228 


AB024573 


Mus 

musculus 


GTP-binding like protein 2 


265 


88 


229 


AF122924


Xenopus 
laevis 


Wnt inhibitory factor-1


316 


40 


230 


G03205 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7286. 


229 


100 


231 


X98260 


Homo sapiens 


M-phase phosphoprotein 11


265 


92 


232 


R92754 


Homo sapiens 


Human growth differentiation factor-12.


682 


95 


233 


R75111 


Homo sapiens 


Glycosyl-phosphatidylinositol-specific
phospholipase-D. 


290 


100 


234 


W69431 


Homo sapiens 


Human secreted protein cwl233_3. 


235 


97 


235 


Y08686 


Homo sapiens 


serine palmitoyltransferase, subunit II 


859 


SI 


236 


AF118275


Homo sapiens 


atrophin-related protein ARP 


117


37 


237 


X81466 


Mus 

musculus 


Embryo Brain Kinase 


460 


62 


238 


U64857 


Caenorhabditis
elegans


similar to the BPTI/Kunitz family of inhibitors;
most similar to tissue factor pathway inhibitor 
precursor (TFPI) 


284 


33 


239 


AJ250840 


Mus 

musculus 


serine/threonine protein kinase 


739 


63 


240 


AJ223472 


Mus 

musculus 


transcription elongation factor TFIIS.h


222 



3H 


241 


Y94906 


Homo sapiens 


Human secreted protein clone rt>649_3 protein 
sequence SEQ ID NO. 18. 


353 


51 




r\Jr 1 CI7 J\J 1 


ri urn u sapiens 


KIo-4-/cii1f!>t^ mtrsmcnnrtrr CT fT-l 
1t( trr/ Alii late iaju omijt 11 I I 


591 


99 


243 


L22022 


Rattus 
norvegicus 


orphan transporter v7-3 


667 


93 


244 


AF016191 


Rattus 
norvegicus 


potassium channel 


1043 


98 


245 


AF097366 


Homo sapiens 


cone sodium-calcium potassium exchanger 


645 


98 


246 


Y29868 


Homo sapiens 


Human secreted protein clone pp325_9. 


497 


98 


247 


AF180475


Homo sapiens 


Not4-Np 


188 


83 


248 


Y17227


Homo sapiens 


Human secreted protein (clone yal-1). 


690 


99 


249 


AF250910 


Manduca 


death-associated small cytoplasmic leucine-rich 


182 


31 



sexta


protein SCLP






250 


AF192756


Kaposi's 


Orf73


1 1 A 


J4 






sarcoma-












associated












herpesvirus








-£J I 




Homo sapiens 


lvlkJpl. prOCClil fLliuuC 










Homo sapiens






ion 


233 


T y4 £ E> 1 < 


Mus


DNA binding protein Rc 


ZD t 


0/ 






musculus 










WoSjUj 


Homo sapiens 


Human acid sensing ionic channel.


1 TX 


87 


^133 




Mus 


Citron-K kinase 


1ZU 1 


Ofi 






musculus 












Homo sapiens


iiuman accreieu protein, acy iu inu. oj/z. 


dsn 


i nn 


257 




Oryctolagus 


Phospholipase 


JO 6 


su 






cuniculus








ICO 

258 




Homo sapiens 


riuman calcium channel bU(~-J/(-KAL--Z- 


1 0<"T 
f S3 / 


tV 






Mus 


L-periaxin 


A1f\ 
4JU 


/Z 






musculus 








260 


AJ250839 


Homo sapiens 


serine/threonine protein kinase 


ool 


1 Ark 


261 


AJ249977 


Homo sapiens 


AMP-activated protein kinase gamma 3 subunit


758 


Vo 


262 


Ar 1415&D 


Rattus 


iLl 1 -2 


1V5 








norvegicus 









263 


AF022859 


Homo sapiens 


neuropilin-2(a0) 


335 






Art £A>fT7 

ArloW/ / 


Homo sapiens 


Ig superfamily receptor LNIR precursor


JO A 


Ol 


265 


Y44662 


Homo sapiens 


Human 14273 G-protein coupled receptor 


636 


99 








(GPCR). 






266 


U27269 


Mus 


sodium glucose cotransporter 


204 


56 






musculus 








267 


AF124491


Homo sapiens 


ARF GTPase-activating protein GIT2


159 


75 


268 


AF127389


Rattus 


putative taste receptor TR1 


209 


39 






norvegicus 








269 


X98296 


Homo sapiens 


ubiquitin hydrolase 


215 


95 


270 


X78482 


Streptococcus 


Fc-gamma receptor 


129 


26 






pyogenes 








271 


AB009883


Nicotiana 


KFJD 


109 


26 






tabacum 








272 


AF137367


Mus 


VPS10 domain receptor protein SORCS 


899 


97 






musculus 








273 


L34938 


Rattus 


ionotropic glutamate receptor 


460 


86 






norvegicus 








274 


AL022724 


Homo sapiens 


dJ413H6.1.1 (hamster Androgen-dependent 


188 


74 








expressed rrotcin Lltvb ru l A 1 i vt, protein) 












(isoform I) 






J. ID 




Homo sapiens 


ubiquitin-conjugating BIR-domain enzyme


i ij 


QA 














/D 




Homo sapiens


Unman a*rr**imA nrnfpin CCH [11 WO- AO^I 

iiuiiiu-ii scurcicu proicin, iu r*\j. oitjj. 


1 ASt 


JO 


2Y / 


JL4UJoU 


Homo sapiens 


thyroid receptor interactor 


4JU 


Ol 


Z f 6 


Ai3U4oS3 1 


Homo sapiens 


MAAioJi protein 


iSJ 


QA 

70 


2/y 


ALUU&U73 


Arabidopsis 


contains PF|00069 Eukaryotic protein kinase


1 3 / 


A 1 






thaliana


domain. 






^ Jin 


M83738


Homo sapiens 


protein-tyrosine phosphatase


to l 




9R1 
Z5I 


4 V fW AVY7 


Homo sapiens 


unnamed protein product 




if 1 


TOO 


At* 14-1 .JZO 


Homo sapiens 


kin a nei icase jiUd/juil-c. i 


/tOT 


flA 


OC1 


Ar 13o33U 


Mus


1 1 cwiomain transcnpuonai repressor rt, i 


OU3 


/O 






musculus 








284 








647 


100 








reading frame protein. 






285 


Y73402 


Homo sapiens 


Human secreted protein clone yc25_l protein 


300 


90 








sequence SEQ ID NO:26. 






286 


AF016411


Homo sapiens 


KCNA3.1B 


137 


100 


287 


W89253 


Homo sapiens 


Human ALP. 


688 


97 


288 


AF112886 


Bos taurus 


differentiation enhancing factor 1 


750 


96 


289 


AF113131 


Homo sapiens 


host cell factor homolog LCP 


367 


44 


290 


U52111 


Homo sapiens 


plexin-related protein 


698 


100 


291 


AF026504 


Rattus 


SPA-1 like protein pl294 


603 


89 
norvegicus








292 


AF102854


Rattus 
norvegicus 


membrane-associated guanylate kinase- 
interacting protein 2 Maguin-2 


124 


53 


293 


X99211 


Drosophila 
melanogaster


ubiquitin-specific protease 


143 


38 


294 


Y94943 


Homo sapiens 


Human secreted protein clone yt14_1 protein
sequence SEQ ID NO:92.


lay 


OA ' 


tax 




Homo sapiens 


Human protein clone tiruz/ys. 


i no 


jy 


296 


ArUiy/oV 


Homo sapiens 


zinc finger protein 


i </i 
1 54 


o<; 
yo 


297 


Y2e56o 


Homo sapiens 


Secreted peptide clone ba 57 /_ 1 . 


5oo 


84 


298 


Y94943 


Homo sapiens 


Human secreted protein clone yt14_1 protein
sequence SEQ ID NO:92.


182 


97 


299 


B08906 


Homo sapiens 


Human secreted protein sequence encoded by 
gene 16 SEQ ID NO:63.


605 


69 


300 


R58890 


Homo sapiens 


Human-32 cadherin-related molecule.


212 


97 


301 


AF022859 


Homo sapiens 


neuropilin-2(a0)


111 


100 


302 


Y71124 


Homo sapiens 


Human mitogenic regulator duox2.


716 


97 


303 


Y44297 


Homo sapiens 


Human receptor tyrosine kinase. 


228 


97 


304 


D32050 


Homo sapiens 


alanyl-tRNA synthetase 


192 


80 


305 


U43586 


Homo sapiens 


protein kinase related to Raf protein kinases; 
Method: conceptual translation supplied by 
author 


428 


72 


306 


R54872 


Homo sapiens 


Human H13 viral receptor mutant 4. 


280 


95 


307 


D78572 


Mus 

musculus 


membrane glycoprotein 


199 


41 


308 


AF255614 


Rattus 
norvegicus 


scaffolding protein SLIPR 


639 


88 


309 


S79463 


Mus sp. 


semaphorin homolog=M-Sema F


162 


89 


310 


AF178941


Homo sapiens 


ATP-binding cassette sub-family A member 2 


736 


100 


311 


U03413 


Dictyostelium 
discoideum


calcium binding protein 


151 


36 


312 


Y87347 


Homo sapiens 


Human signal peptide containing protein HSPP- 
124 SEQ ID NO:124. 


744 


100 


313 


Z97055 


Homo sapiens 


dJ388M5.4 (putative GS2 like protein)


789


99


314 


AC004010 


Homo sapiens 


similar to Leucine-rich transmembrane proteins; 
44% similarity to U42767 (PID:gl736918) 


197


38


315 


AL021392


Homo sapiens 


dJ439F8.2 (supported by GENSCAN and 
GENEWISE)


278 


38 


316 


U70209 


Mus 

musculus 


polycystic kidney disease 1 protein 


165 


38 


317 


AF109643


Rattus 
norvegicus 


coxsackie-adenovirus-receptor homolog 


223 


38 


318 


AF104923


Homo sapiens 


putative transcription factor 


138 


84 


319 


AF100287


Trypanosoma 
vivax 


activated protein kinase C receptor homolog 


141 


38 


320 


G00588 


Homo sapiens 


Human secreted protein, SEQ ID NO: 4669. 


125 


51 


321 


Y21591 


Homo sapiens 


Human secreted protein (clone CC332-33). 


459 


97 


322 


D26070 


Homo sapiens 


human type 1 inositol 1,4,5-trisphosphate 
receptor 


232 


97 


323 


Y27918 


Homo sapiens 


Human secreted protein encoded by gene No. 
115.


306 


88 


324


AF010144


Homo sapiens 


neuronal thread protein AD7c-NTP


209 


70 


325 


M19650


Homo sapiens 


2',3'-cyclic-nucleotide 3'-phosphodiesterase (EC
3.1.4.37) 


214 


97 


326 


W80396 


Homo sapiens 


A secreted protein encoded by clone bp646_10. 


140 


70 


327


X75756


Homo sapiens 


protein kinase C mu 


540 


78 


328 


G02292 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6373. 


721 


99 


329 


AF168990


Homo sapiens 


putative GTP-binding protein 


877 


99 


330 


S67984 


Homo sapiens 


anti-HIV gp120 antibody heavy chain variable
region 


581 


80 


331 


X13916 


Homo sapiens 


LDL-receptor related precursor (AA -19 to 4525)


2823 


98 


332 


Y87330 


Homo sapiens 


Human signal peptide containing protein HSPP-
107 SEQ ID NO:107.


1127 


100 


333 


Y28503 


Homo sapiens 


HGFH3 Human Growth Factor Homologue 3. 


320 


98 


334 


AC002563 


Homo sapiens 


putative RHO/RAC effector protein; 95% 


327 


93 



similarity to P49205 (PID:g1345860)






335 


Y87347 


Homo sapiens 


Human signal peptide containing protein HSPP-
124 SEQ ID NO:124.


mi 


67 


336 


AF006466 


Mus 

musculus 


lymphocyte specific formin related protein 


193 


75 


337 


AF265555 


Homo sapiens 


ubiquitin-conjugating BIR-domain enzyme


632 


97 


338 


Y13443


Homo sapiens 


Amino acid sequence of hSlo3-2. 


516 


100 


339 


Y07637 


Homo sapiens 


putative GABA-gated chloride channel 


lay 




340 


Y05734 


Homo sapiens 


Human Grb7 effector 2.2412 protein. 


2156 


99 


341 


AE000497 


Escherichia 
coli 


L-idonate transcriptional regulator 


928 


98 


342 


D90855 


Escherichia 
coli 


glycerol-3-phosphate dehydrogenase (EC
1.1.99.5) chain A, anaerobic


769 


99 


343 


D85613 


Escherichia 
coli 


membrane component 


399 


100 


344 


M93239 


Escherichia 
coli 


transmembrane protein 


232 


100 


345 


M60177 


Escherichia 
coli 


enterobactin 


759 


99 


346 


D90699 


Escherichia 
coli 


Sensor protein copS (EC 2.7.3.-). 


638 


97 


347 


D90843 


Escherichia 
coli 


CapB protein. 


552 


100 


348 


M13422


Escherichia 
coli 


49 kd protein 


1193 


96 


349 


L10328 


Escherichia 
coli 


similar to drug resistance translocases 


340 


90 


350 


X69942 


Mus 

musculus 


enhancer-trap-locus-1


560 


82 


351 


AF239613 


Homo sapiens 


apamin-sensitive small-conductance Ca2+-
activated potassium channel


463 


80 


352 


D90777 


Escherichia 
coli 


3-hydroxybutyryl-CoA dehydrogenase (EC 
1.1.1.157) (b-hydroxybutyryl-CoA
dehydrogenase) (BhbD). 


577 


100 


353 


D90863 


Escherichia 
coli 


similar to 


311 


98 


354 


Y52386 


Homo sapiens 


Human transmembrane protein HP02000. 


133 


58 


355 


Y31645 


Homo sapiens 


Human transport-associated protein-7 (TRANP- 
7). 


482 


55 


356 


Y58637 


Homo sapiens 


Protein regulating gene expression PRGE-30. 


119 


51 


357 


AF119226


Homo sapiens 


dual-specificity tyrosine phosphatase YVH1 


I too 


inn """ " 


358


Yo72iy 


Homo sapiens 


Human secreted protein sequence SEQ ID 


ID J 


1 AA 


359


JUU 1 3^ 


Homo sapiens 


beta-fibrinogen






360


G03789


Homo sapiens 


Human secreted protein, SEQ ID NO: 7870.


1 9ft 


^n 
/u 


361 


R28916 


Homo sapiens 


Type III procollagen (prior art).


108 


40 


362


Til < K 


Rattus 
norvegicus 


phospholipase C delta-4 




OJ 


363


G03119


Homo sapiens 


Human secreted protein, SEQ ID NO: 7200.


o'< ' 


dO 


364


U4 /o 


Gallus gallus


chicken brain factor-2




j** 


365


G03789


Homo sapiens 


Human secreted protein, SEQ ID NO: 7870.


1 01 

1 113 


OJ 


366


otwuvi 


Homo sapiens 


T 1 . , . ~ ^ ,m. A CCA T FA VIA- Q 1 "TO 

Human secreted protein, bcK^ ID nu. oi fJ., 


1 1 1» 




367 


X98258 


Homo sapiens 


M-phase phosphoprotein 9


564 


75 


368 


AL021366


Homo sapiens 


CICK.0721Q.3 (Kinesin related protein) 


3387 


OA 


369




Peromyscus 
leucopus 


reverse transcriptase 


Q9 




370 


X86400 


Homo sapiens 


gamma subunit of sodium potassium ATPase 
like 


242 


73 


371 


G03172 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7253. 


165 


56 


372 


U49974 


Homo sapiens 


mariner transposase


257 


55 


373 


X13916 


Homo sapiens 


LDL-receptor related precursor (AA -19 to 4525)


21193 


99 


374 


AF234765 


Rattus 
norvegicus 


serine-arginine-rich splicing regulatory protein 
SRRP86 


1182 


78 


375 


U49974 


Homo sapiens 


mariner transposase 


172 


55 






WO 01/57188 



PCT/US01/03800







376


G01984


Homo sapiens 


Human secreted protein, SEQ ID NO: 6065.


ZZl 


D / 


ill 


rvin^o 
vjuuooy 


Homo sapiens 


Unman o«/*rfll«^ nmtotn CTP/"l TT"1 XlfV Al^d 

tiurnan secreted proicui, oCv<f iu rsvj. t/ju. 




inn 


J 10 






vj i jt uinaing protein 


t A^fi 


Q1 

7 1 














J iy 


IV07U7J 


fuJIUO -j aplCIl^ 


A ntLHIU Pah tst 1 1 lirrht rhain 

i fc VilLl — XIJ V roil LdU 1 IIIUll. ^Mdlll. 




37 




104074. 


nuiuvi 




XdZ. J 


37 


Jo 1 




Hnmrt ca ni tit c 

rajmu aapidia 


T AK-An 


jji/ 


*t j 


H17 


I \ftA9.~\v\ 


Dictyostelium


protein tyrosine kinase


I 1 s 

I I j 


44 






discoideum








383 


G02916 


Homo sapiens


Human secreted protein, SEQ ID NO: 6997.


618 


98 


IRA. 
Jot 




Homo sdpi ens 


Human ^m-iW^H ni-nt^in <3PH TPl WCV ^77^ 
XlUIllall 5CCICICU pruiClii, IU rv\S. Ji / J. 


fil 7 


7 J 


Jo J 


A T7d^R77 


riomtj sapiens 


type i urarisiiieinui aiic recepior 




i fin 

1 uu 


JoO 


UOD7 /** 


nomu sapiens 


VIA AA770 


71 iR 

i 1 T-O 


051 

70 


187 

Jo / 


0/11701 


Homo sapiens 


Human secretea protein, bin^ iu inw. /zo4. 


1 d7 
14/ 


J\J 


ICC 
JOO 


G04072


Homo sapiens 


Human secreted protein, SEQ ID NO: 8153.


no 

yv 


<Q 

Dy 


joy 


Ml 2 140 


Homo sapiens 


envelope protein 


iy / 


jl 


JVU 


AJ2933\)9 


Homo sapiens 


NHP2 protein 


461 


/ / 


391 


Y42751


Homo sapiens 


Human calcium binding protein 2 (CaBP-2). 


1 O 1 

lol 


y4 


392 


W48351 


Homo sapiens 


Human breast cancer related protein BCRB2. 


241 


66 


393 


Y14442


Homo sapiens 


olfactory receptor protein 


339 


54 


394 


W85607 


Homo sapiens 


Secreted protein clone da228_6. 


957 


100 


395 


Y76332 


Homo sapiens 


Fragment of human secreted protein encoded by 


171 


34 








gene 38. 






396 


G03930 


Homo sapiens 


Human secreted protein, SEQ ID NO: 8011.


250 


100 


397 


AB032904 


Hylobates 


dopamine receptor D4 


105 


35 






syndactylus








398 


AJ007798 


Homo sapiens 


stromal antigen 3, (STAG3) 


861 


85 


399 


Y91405 


Homo sapiens 


Human secreted protein sequence encoded by 


1047 


92 








gene 2 SEQ ID NO: 126. 






400 


Y29861 


Homo sapiens 


Human secreted protein clone cb98_4. 


162 


37 


401 


D87002 


Homo sapiens 


similar to rat integral membrane glycoprotein; 


527 


78 








accession number Z21513. 






402 


AF100754


Homo sapiens 


ancient ubiquitous protein AUP1 isoform 


853 


95 


403 


X74904 


Gallus gallus


alpha-2-macroglobulin receptor


258 


60 


404 


AF075462 


Mus 


ADP-ribosylation factor-directed GTPase 


545 


89 






musculus 


activating protein isoform b 






405 


X92887 


Human 


pol/env 


162 


30 






endogenous 










retrovirus K 








406 


Y30162 


Homo sapiens 


Human dorsal root receptor 4 hDRR4. 


325 


72 


407 


AK022626 


Homo sapiens 


unnamed protein product 


2833 


99 


408 


L13802


Homo sapiens 


ribosomal protein small subunit


264 


92 


409 


Y91600 


Homo sapiens 


Human secreted protein sequence encoded by 


1788 


89 








gene 9 SEQ ID NO:273.






410 


W88745 


Homo sapiens 


Secreted protein encoded by gene 30 clone 


2004 


99 








HTSfc.V09. 






A 1 1 


A±JU4jy53 


Mus


L,nat-H 


Zola 


oZ 






musculus 








a n 
4 \L 


YSOZ33 


Homo sapiens 


Human secreted protein hin i ivlAzy, sty id 


|A1i 

1U 14 










fvO.148. 






/ill 
4 I j 




Pan


MHC class I A


^lOJ 


/ 1 






troglodytes 










At* 1 jjWi 


Homo sapiens 


in i -tvtjN- / antigen 


SjU 


yj 


415 


G03203 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7284. 


88 


48 




i j /yi l 


Homo sapiens 


Human transmem orane protein H i jVLriN-j->. 


zoo 


so 
oy 


417 


W27651 


Homo sapiens 


Secreted protein AT205. 


481 


60 


418 


Y76884 


Homo sapiens 


Retinoblastoma binding protein-7 sequence.


3077 


87 


419 


AF255559


Notothenia 


alpha tubulin 


289 


68 






coriiceps 








420 


G01984


Homo sapiens 


Human secreted protein, SEQ ID NO: 6065. 


209 


74 


421 


AL109827


Homo sapiens 


dJ309K20.2 (acrosomal protein ACR35 (similar 


1446 


96 








to rat sperm antigen 4 (SPAG4))) 






422 


AC008075 


Arabidopsis 


F24J5.4 


112 


35 






thaliana 













47^ 


AF71 1705 


XlXJl Jlu 3ULJi£jL 1 J 


A In rn-r**rtr#»c^nr 1 


1090 


100 


474 


AR7^48R7 


Homo sapiens


ft a Vyf nvjno 1 


UiDfl 


07 

y i 


47^ 


Y3 5942 


Homo sapiens


T*vt^nfi^j"l hiimfin ^^^Ttp4^H nmf^m g^-n ■ 1 t*t\r***- f^i 

DnLbUUCU [lUUulil TvUvlvU pUJltlll JVUUWllA, <JL^K^ 


* 7U 1 


99 








11-/ 1NVJ T lyl. 
















98 


427 


L12392






16080 


99 


47R 


Y94990 




Humeri vrr^trH nrnt^in vK?1 1 ^FO TO KO'?0 


768 


98 


47Q 


A 1793 ^7"* 


nUlllU zkIJJICIIj 




542 


87 


4?fl 


I O^HM I 


Ur|m i\ CO n i an C 


jA rn inn n i rl c^fi i i j* n r*j* nf 4 tin m 9 n tc M A * 
rvlIilHU oClLl 3A?l^LICIlwC Ui a 11 111 il all IVl tjt.' 


2074 


100 








accriiHatp-fl nrnt^in 






4T1 




ILVJIJIU «>C1. LJ J t*llj 


1-TiimAn c^rr^rf^H nrnt^tn QFO IT) MO' £01 1 

£j.|yllllull jCvlClGU UlUlUlll! OJj ll^i l^i . \J*IJ 1, 


723 


95 


432 


G04067 


Wnrnn csa rn#*n , v 


Human secreted protein, SEQ ID NO: 8148.


73 


42 




AF1 ^Q7Qfi 


L>y LJjpCiS 1 MJli 


cx lci isi n - 1 like pruicin 


Ul J 


4R 














434 


W48351 




f-TiimAn hirmct rsnf^r rplat^H nrntrin Ar^T?R9 


135 


44 


41 5 


V71874 




pLlU^pilUI y laoWr rwlllaot 


"*447 


97 


436 


AF1 61476 




I I JI < — -J 


268 


74 


437 


Y30812 


T-frvmri QAnipnQ 

I lt\J 3<XJJl(^liO 


Humjirt c^^r^tftH nrntfMn ^tif*nH#*H "frAm o^n** 7 

IIUIUCUI Ul UiCJll Ctl^UVJ^J XI Villi " 


1055 


52 


41R 


G03798


l-i ft™ o en ni^nc 
nwinu M&piCiLb 


Human secreted protein, SEQ ID NO: 7879.








A. 1 *t / OO 


numu ^apiciLb 


A 13 A _ A n /HAnt/ti* a 1 rth 9 1 ci ill! tri i ^ 

u/ujft'A rcucpiur aipna i buuunii 


2294 




440 


X02344 








95 


441 
*Hr J 


/\r 1004io 


Homo sapiens 


activating signal cointegrator 1


1 8H7 


■ inn 




I 1 1 fi77 


Homo sapiens 


zinc finger protein 


70S 


■^/i 

->4 


A At 




Homo sapiens 


unman secreteo protein, oti/ id nu. /za4. 


7J 


7A 


AAA 

444 


A5.Z 14U 


unidentified 


HUMAN NL)K 


C 1 

Z43 L 


1UU 


AA< 




Homo sapiens 


ryanodine receptor 2 


yj 3D 


QO 


440 




Homo sapiens 


rKUi /Jo 


£.1.1 


4y 


447 


AF245447 


Homo sapiens 


sphingosine kinase type 2 isoform 


576 


99 


448 


API 11 AOiC 

Ar liJUao 


Homo sapiens 


membrane-type serine protease 1 


2630 


94 


449 


U87305 


Rattus 


transmembrane receptor UNC5H1


817 


93 






norvegicus 








450 


ArUH I24y 


Homo sapiens 


JAW1-related protein MRVI1A long isoform


45os 


yy 


451 


AC005498 


Homo sapiens 


R31665 1 


316 


62 


AC\ 

452 


M60235 


Homo sapiens 


granule membrane protein-140


A £ A 

464 


73 


453 


AB036706


Homo sapiens 


intelectin 


730 


88 


454 


G00918 


Homo sapiens 


Human secreted protein, SEQ ID NO: 4999. 


263 


81 


455 


Y22634 


Homo sapiens 


Human cytokine inducible regulatory protein-1


192 


67 














456 


Y36705 


Homo sapiens 


Fragment of human secreted protein encoded by 


106 


40 








gene 62. 






457 


N91325 


Homo sapiens 


DNA encoding human growth hormone receptor. 


3282 


96 


458 


M19155 


Plasmodium 


S-antigen precursor 


110


36 






falciparum 








4?y 




Homo sapiens 


Amino acid sequence of protein PRO257.




ys 


460 


Y02693 


Homo sapiens 


Human secreted protein encoded by gene 44 


149 


43 








clone HTDAD22.






At. 1 

461 


Y14482


Homo sapiens 


Fragment of human secreted protein encoded by 


184 


34 








gene i /. 






40.^ 




™ : 

Homo sapiens 


Human secreted protein clone pm749_8 protein 


1 Ji 


4 / 








sequence atixi iu Ntj.io. 






4f^ 


V94QAA 


Triticum


low molecular weight glutenin


1 AO 








aestivum 








4/^4 


vt i yy i " 


Homo sapiens 


Human Ksr-1 (kinase suppressor of Ras).


1 781 




4fi*; 


A P 1 HQ7fi/i 
rtr ioJ'/04 




alpha/beta hydrolase-1


















466 


U93569 


Homo sapiens 


p40 


101 


30 


467 


Y41528 


Homo sapiens 


Fragment of human secreted protein encoded by 


1172 


99 








gene 77. 






468 


G02872 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6953. 


149 


52 


469 


AJ000008 


Homo sapiens 


PI3-kinase


5832 


97 


470 


X70922 


Mus 


neurotoxin homologue 


118 


47 






musculus 








471 


G03797 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7878. 


198 


75 


472 


Y36705 


Homo sapiens


Fragment of human secreted protein encoded by 


72








gene 62. 






47,5 


nmi 1 1 


Homo sapiens 


Human secreted protein, iifcy iu iNU. ojy*. 




1 UU 


4 /4 


Vn*7(VT7 
1 U / IAJ / 


Homo sapiens


Breast cancer associated antigen precursor 
sequence. 


inn 

1U1J 


Q"7 


4 / D 




Homo sapiens 


Human toKr i protein. 


OA 1 


oU 


4 /o 




Homo sapiens 


numan or east cancer reiaieu protein id>^i\jdz. 




bj 


477 


Y02693


Homo sapiens 


Human secreted protein encoded by gene 44
clone HTDAD22.




ou 


4 /o 


fin i sin 


— — : 

Homo sapiens 


numan secreted protein, o*n^ i~u rvu. jvoi. 




1 UU 


479 


AF102777


Mus 

musculus 


FYVE finger-containing phosphoinositide kinase 




92 


480




Homo sapiens 


rium an secreted protein, bey uu rsu. /ijj. 


1Z3 




jgi 

481 


Ws/ /U] 


Homo sapiens 


A human membrane fusion protein designated 

QVTAVl 






/ICO 


UuJ 117 


— : 

Homo sapiens 


numan sccreico protein, oiiy llj t^w. /zuu. 


111 
1 J I 


jy 


483


nr Z 1U0D 1 


Homo sapiens 


INALrlo 




jy 


484


A C/l i n 1 A A 


Homo sapiens 


neuronal thread protein AD7c-NTP


1A1 


jU 


485 


G00637 


Homo sapiens 


Human secreted protein, SEQ ID NO: 4718. 


129 


70 


486


U15174


Homo sapiens 


BCL2/adenovirus E1B 19kD-interacting protein
3


i a a 


15 


487 


Y76167 


Homo sapiens 


Human secreted protein encoded by gene 44. 


627 


100 


488


Aj2>Id2\3 


Homo sapiens 


stabilin-1 


1244 


91


489 


G03798


Homo sapiens


Human secreted protein, SEQ ID NO: 7879.


313 


65 


490 


L12392


Homo sapiens 


Huntington's Disease protein 


16081 


100 


491 


G03789 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7870. 


197 


66 


492 


J03799 


Homo sapiens 


laminin-binding protein 


228 


70 


493


U15174


Homo sapiens 


BCL2/adenovirus E1B 19kD-interacting protein
3 


128 


41 


494 


Y02693 


Homo sapiens 


Human secreted protein encoded by gene 44 
clone HTDAD22. 


197 


67 


495 


AC005175 


Homo sapiens 


R31449 3 


889 


94 


496 


G03786


Homo sapiens 


Human secreted protein, SEQ ID NO: 7867. 


229 


61 


497 


AB030237 


Canis
familiaris


D4 dopamine receptor 


90 


48 


498




Homo sapiens 


Human secreted protein, SEQ ID NO: 6953.


228 


05 


499 


U70935 


Peromyscus 
maniculatus 


reverse transcriptase 


213 


52 


500 


U48508


Homo sapiens 


skeletal muscle ryanodine receptor 


26406 


99 


501 


G03371 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7452. 


105 


58 


502 


AF119851


Homo sapiens


PRO1722


156 


62 


503 


AF113685


Homo sapiens 


PRO0974 


116


50 


504 


U79458 


Homo sapiens 


WW domain binding protein-2 


322 


59 


505 


W29651 


Homo sapiens 


Human secreted protein CD124_3. 


608 


55 


506 


W85459 


Homo sapiens 


Secreted protein encoded by clone dhl 135_9. 


986 


70 


507 


Y86265


Homo sapiens 


Human secreted protein HUhXit J/, z>b,Q ID 
NO: 180. 


115


33


DVO 


A f 1 £.f\ 1 


Homo sapiens 


bA^43J 1d.J (similar to MYLK (myosin, light
polypeptide kinase))


184 


yl 




U43360


Peromyscus 
maniculatus 


reverse transcriptase 


97 


oz 


510




Homo sapiens 


Human secreted protein, t>t,(j LL> mu. /o/u. 


1 1 T 
1 I / 




"si 1 




Homo sapiens 


Human secreted protein dn740_3. 


1 f\KQ 


100


512




Homo sapiens 


neuronal thread protein AD7c-NTP 


ZVjj 




513 


AJ133439 


Homo sapiens 


GRIP1 protein 


2151 


100 


514


AE.Wjt JO 


Drosophila 
melanogaster 


gene product






515 


Z17206


Xenopus 
laevis 


p46X:Eg22 


128 


40 


516 


AF104413 


Homo sapiens 


large tumor suppressor 1 


1766 


94 


517 


G03797 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7878. 


92 


40 


518 


AF151083 


Homo sapiens 


HSPC249 


444 


98 


519 


S80864 


Homo sapiens 


cytochrome c-like polypeptide 


318 


50 


520 


X92485 


Plasmodium 
vivax 


pval 


170 


61




521 


G03790 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7871. 


159 


59 


522 


AF121857


Homo sapiens 


sorting nexin 7 


259 


40 


523 


G02654 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6735. 


82 


37 


524 


W88627 


Homo sapiens 


Secreted protein encoded by gene 94 clone 


253 


73 








HPMBQ32. 






525 


AF119851


Homo sapiens


PRO1722


162 


57 


526 


Y27761 


Homo sapiens 


Human secreted protein encoded by gene No. 47. 


154 


57 


527 


G02707 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6788. 


70 


45 


528 


U47924 


Homo sapiens 


C8 


1112 


86 


529 


G04063 


Homo sapiens 


Human secreted protein, SEQ ID NO: 8144. 


84 


45 


530 


G03203 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7284. 


111 


60 


531 


G04067 


Homo sapiens 


Human secreted protein, SEQ ID NO: 8148. 


92 


65 


532 


G03267 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7348.


75 


29 


533 


G03203 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7284. 


182 


48 


534 


AF068286 


Homo sapiens 


HDCMD38P 


861 


100 


535 


U07707 


Homo sapiens 


epidermal growth factor receptor substrate 


228 


60 


536 


G01955


Homo sapiens 


Human secreted protein, SEQ ID NO: 6036. 


484 


75 


537 


AF219232 


Gallus gallus 


qin-induced kinase


206 


53 


538 


AF135022


Homo sapiens 


mediator 


128 


100 


539 


G03267 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7348.


141 


59 


540 


AF016430


Caenorhabditi 


contains similarity to a BR-C/TTK domain 


853 


39 






s elegans 








541 


AC003093 


Homo sapiens 


OXYSTEROL-BINDING PROTEIN; 45%


408 


66 








similarity to P22059 (PID:g129308)






542 


M29487 


Homo sapiens 


integrin alpha subunit precursor 


517 


81 


543 


AF102530


Mus 


olfactory receptor F3 


327 


73 






musculus 








544 


Y73431 


Homo sapiens 


Human secreted protein clone yb186_1 protein


386 


100 








sequence SEQ ID NO: 84. 






545 


AE004833 


Pseudomonas 


probable TonB-dependent receptor 


279 


42 






aeruginosa 








546 


G03793 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7874. 


264 


53 


547 


Y69192 


Homo sapiens 


A human monocyte-macrophage apolipoprotein


1772 


67 








B receptor protein. 






548 


Y91493 


Homo sapiens 


Human secreted protein sequence encoded by 


176 


100








gene 43 SEQ ID NO: 166. 






549 


G01571


Homo sapiens 


Human secreted protein, SEQ ID NO: 5652. 


777 


99 


550 


AF044588 


Homo sapiens 


protein regulating cytokinesis 1; PRC1


1953 


88 


551 


Y29332 


Homo sapiens 


Human secreted protein clone pe584_2 protein 


1224 


94 








sequence. 






552 


X98330 


Homo sapiens 


ryanodine receptor 2 


24621 


99 


553 


Y42782 


Homo sapiens 


Human UC Band #33 1 protein. 


684 


95 


554 


AB025258 


Mus 


granuphilin-a 


501 


41 






musculus 








555 


AJ010346 


Homo sapiens 


RING-H2 


1468 


100 


556 


W92388 


Homo sapiens 


Human TR-interacting protein S239a. 


538 


92 


557 


AF119851


Homo sapiens


PRO1722


175 


59 


558 


AF117756


Homo sapiens 


thyroid hormone receptor-associated protein 


183 


32 








complex component TRAP150 






559 


G02872 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6953. 


319 


68 


560 


D86214 


Mus 


Ca2+ dependent activator protein for secretion 


1010 


93 






musculus 








561 


AF187325


Canis


melanoma antigen 


287 


55 






familiaris 








562 


AJ001981 


Homo sapiens 


OXA1L 


2512 


99 


563 


Z17238 


Rattus 


glutamate receptor subtype delta-1 


338 


66 






norvegicus 








564 


W30638 


Homo sapiens 


Partial human 7-transmembrane receptor 


371 


100 






HAPOI67 protein. 






565 


AC005620 


Homo sapiens 


R33590 1 


467 


97 


566 


Y99358 


Homo sapiens 


Human PRO1772 (UNQ834) amino acid


1138 


78 








sequence SEQ ID NO:63.






567 


AL031177 


Homo sapiens 


dJ889M15.3 (novel protein)


1002 


58 


568 


AF151043


Homo sapiens 


HSPC209 


798 


100


569




Homo sapiens


liver-specific transporter


231 


100 






Homo sapiens 


Misshapen/NIKs-related kinase MINK-1


1532 


100 


Ml 

d y l 




Homo sapiens


Colon cancer associated antigen precursor 
sequence. 


1064 


100 




AL031177


Homo sapiens


dJ889M15.3 (novel protein)


735 | 55 


<T1 




Homo sapiens 


Mernorane-Douna protein riwiyu. 


254 


45 


574


ArtUJ / 1 Uo 


Homo sapiens 


seven transmembrane domain orphan receptor 


1883 


99 


575 


D43949 


Homo sapiens 


This gene is novel. 


836 


100 


576




Homo sapiens 


Human breast tumour-associated protein 57. 


108 


50 


577 


G00352 


Homo sapiens 


Human secreted protein, SEQ ID NO: 4433.


141 


75 


578 


R95913 


Homo sapiens 


Neural thread protein. 


140 65 


579 


AK025116


Homo sapiens 


unnamed protein product 


201 70 


580 


Y86473


Homo sapiens 


Human gene 52-encoded protein fragment, SEQ 
ID NO:388. 


77 


70 


581 


AF196779


Homo sapiens 


JM10 protein 


450 


100 


582 


AF188706


Homo sapiens 


g20 protein 


330 


98 


583 


AB030234 


Canis 
familiaris 


D4 dopamine receptor 


64 


56 


584 


G02621 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6702. 


345 


90 


585 


AL096828 


Homo sapiens 


dJ963E22.1 (Novel protein similar to NY-REN-2
Antigen) 


268 


85 


586 


Y30819 


Homo sapiens 


Human secreted protein encoded from gene 9. 


235 


35 


587 


G00357 


Homo sapiens 


Human secreted protein, SEQ ID NO: 4438. 


132 


56 


588 


G02872 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6953. 


182 


79 


589 


AF235017 


Mus 

musculus 


2P1 protein 


764 


80 


590 


W88627 


Homo sapiens 


Secreted protein encoded by gene 94 clone 
HPMBQ32. 


329 


81 


591 


Y30709 


Homo sapiens 


Amino acid sequence of a human secreted 
protein. 


110 


43 


592 


Y53875 


Homo sapiens 


A human seven transmembrane signal transducer 
polypeptide. 


1369 


92 


593 


Y53051 


Homo sapiens 


Human secreted protein clone dd119_4 protein
sequence SEQ ID NO: 108. 


1112 


97 


594 


Y27658 


Homo sapiens 


Human secreted protein encoded by gene No. 92. 


763 


79 


595 


G03798 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7879. 


156 


58 


596 


AF151110


Mus 

musculus 


COP1 protein


2215 


95 


597 


G03786 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7867. 


157 


65 


598 


AF192499


Mus 

musculus 


putative secreted protein ZMG37 


143 


40 


599 


AF119855


Homo sapiens


PRO1847


236 


76 


600 


G02872 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6953. 


212 


73 


601 


Y00295 


Homo sapiens 


Human secreted protein encoded by gene 38. 


567 


88 




AF184971


Homo sapiens


class II cytokine receptor ZCYTOR7


2015 


74 


603 


AF061936 


Homo sapiens 


diacylglycerol kinase iota 


773 


96 


604 


AL096828


Homo sapiens 


dJ963E22.1 (Novel protein similar to NY-REN-2 
Antigen) 


1333 


93 


605 


AB033100


Homo sapiens 


KIAA1280 protein


3915 


100 


606 


X75756


Homo sapiens 


protein kinase C mu 


3916 


99 


607 


D86983 


Homo sapiens 


similar to D.melanogaster peroxidasin(U11052)


5758 


99 


608 


W69341 


Homo sapiens 


Secreted protein of clone CQ279_1, 


1377 


99 


609 


W88627 


Homo sapiens 


Secreted protein encoded by gene 94 clone 
HPMBQ32. 


339 


82 


610


Y27868


Homo sapiens 


Human secreted protein encoded by gene No. 
107. 


116 


62 


611 


AF202636 


Homo sapiens 


angiopoietin-like protein PP1158


2164 


100 


612 


AF090944 


Homo sapiens 


PRO0663 


218 


82 


613 


Y02693 


Homo sapiens 


Human secreted protein encoded by gene 44 
clone HTDAD22. 


195 


59 


614 


M87053 


Rattus 
norvegicus 


lens membrane protein 


450 


84 


615 


AC004232 


Homo sapiens


FPM315 


163 


37 


616 


G01984


Homo sapiens 


Human secreted protein, SEQ ID NO: 6065. 


205 


79


617 


Y91524 


Homo sapiens 


Human secreted protein sequence encoded by 
gene 74 SEQ ID NO: 197. 


821 


99 


618


AJ245621 


Homo sapiens 


CTL2 protein 


2258 


99 


619 


Y76198 


Homo sapiens 


Human secreted protein encoded by gene 75. 


108 


64 


620 


AF067864


Homo sapiens 


transferrin receptor 2 alpha 


3922 

. 


94 


621




Escherichia
coli


i raiiam emu nine proicui appv^ 


j / j 






W7SRSR 


Homo sapiens


ruimaii icv.rcioiy pruicm oi nunc v-o / 


/ j\J 


ion 






Homo sapiens


t-TurrtQTi cf*rrr*t*^ r\rr*t*m vh1 ? 1 CPH IPk ~\Jf~l*4 

nuruau scLr&ieu proicm vp i x i , oev^ i nu.^. 




inn 


624




Mus 

musculus 


T MYnRfl 


fin 


HI 

OJ 






Paramecium 
bursaria

virus 1 


rro-ncn, ir/rrrNivibLrLfO [yxj 


QA 


AA 


626 


U79260 


Homo sapiens 


unknown 


194 


70 


627 


R95913 


Homo sapiens 


Neural thread protein. 


99 


50 


628




Homo sapiens


Human em»rf»t-»»H nrntpin CPf) in >J( - V 7S^1 


427 


100 


629 


Y36281 


Homo sapiens 


Human secreted protein encoded by gene 58. 


590 


100 




Y02693


Homo sapiens 


Human secreted protein encoded by gene 44 
clone HTDAD22.


1 £. C 

lb5 


VO 


1 

OJ 1 




Homo sapiens


Unman c*(«rAtAH AprttAin f~^t ZT\ "KIO ■ /^T~) r\ 

rtiiman secrereo proiein, sty iu inu. Oi^u. 


9fiS 
ZDS 




632 


U16996


Homo sapiens 


protein tyrosine phosphatase


351 


80 


633


AF121857


Homo sapiens 


sorting nexin 7 




T Art 

I w 


634 


A T?^ O 1 Til 

Ar2o J772 


Homo sapiens 


similar to Homo sapiens ribosomal protein L10 
encoded by GenBank Accession Number 


340 


77 


635 


Y07090 


Homo sapiens 


Renal cancer associated antigen precursor 
sequence. 


277 


64 


636 


AB013382


Homo sapiens 


DUSP6 


414 


76 


637 


G02872


Homo sapiens


Human secreted protein, SEQ ID NO: 6953.


315 


71 


638 


M95762 


Rattus 
norvegicus 


GABA transporter


924 


on 

tsy 




G03789


Homo sapiens 


Human secreted protein, SEQ ID NO: 7870. 


ziy 


OV 


640 


YUI400 


Homo sapiens 


Secreted protein encoded by gene 18 clone

XTjNrir (Jxy. 


137 


/V 


C>4 1 


AL.UU8U I j 


Arabidopsis
thaliana


FOAT* A 
rZ4J j. 4 


1 Z I 








Homo sapiens


riuman secreieu proiciri encoueu oy gene yo 


O i J 




643 


AB015982


Homo sapiens 


serine/threonine kinase 


485 


98 


Cr»*r 


V9^Rnfi 


Homo sapiens


Human secreted protein fragment encoded from
gene 23.






645 






ty\«tyi n rtt*nt"^iTi Fl API O 


474 


100 


646 








200 


38 


<J47 


W48804 


Homo sapiens 


Homo sapiens clone BK158_1 protein.


1203 


99 


648 


/vr / _> j u 






1440 


98 


649 


Y36203 


Homo sapiens 


Human secreted protein #75. 


233 


73 


650




Homo sapiens 


Unman c^r*-rAt<wH nmtprn CT?0 T n M {~\ • XQ<7 

nuuitui seucicu protein, ocy iu inw. oyjj, 


171 


7R 


Q J J 


V171 OQ 


Homo sapiens 


Human receptor molecule (REC) encoded by

Inrvtr flnn^ 909917Q 


i ni7 

J UIjL 


1 lift 


652 


AB032909


Hylobates 


dopamine receptor D4 


122 


32 






Homo sapiens


unnamed protein product


I oo 




654 


W73411 


Homo sapiens 


Human secreted protein encoded by Gene No. 
15. 


57 


37 


655 


L22455 


Rattus 
norvegicus 


mu opioid receptor 


116 


34 


656 


G03112 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7193. 


110 


45 


657 


G02345 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6426 


459 


97 


658 


W88627 


Homo sapiens 


Secreted protein encoded by gene 94 clone 
HPMBQ32. 


291 


75 


659 


G02832 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6913. 


134 


65 


660 


Y91423 


Homo sapiens 


Human secreted protein sequence encoded by 
gene 11 SEQ ID NO: 144. 


333 


96 





661 


G03789 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7870. 


168 


68 


662 


Y53886 


Homo sapiens 


A suppressor of cytokine signalling protein 
designated HSCOP-6. 


375 


43 


663 


W75771 


Homo sapiens 


Human OTP binding protein APD08. 


629 


100 




AT 09fi77n 


J 1UJI lU JtXVJ i <wi J o 


bA150A6.2 (novel 7 transmembrane receptor 
(rhodopsin family) (olfactory receptor like) 
protein (hs6M1-21))


480 


55 




AB037734


A 1V1 1 JU J (*|J 1 ^ 1 1 *J 


KIAA1313 protein


978 


96 


ftfifi 

\J\J\J 


W82841 




Human cerebral protein-1


192 


84 


w / 


W82841 




Human cerebral protein-1


182 


87 


' Ate ' 


ADuJUl(rt 


musculus 

JJ1UJVIU i*** 


nindinp refnon 


757 


68 


669 


AB032919 


Hylobates
muelleri


dopamine receptor D4


85 


37 


670 


AF 1 (172QS 


Rattus 
norvegicus 


outer membrane protein


746 


81 


A71 
U / 1 




Homo sapiens


IpiiWwt.p KTirfhr^ nrnt^in 


394 


93 




V¥ ojvtJO 


\-tf\rr\r\ <;ahi^n^ 




261 


91 


673 


G03203 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7284. 


106 


48 


Of** 


at ni^^ft7 


Homo sapiens 




2388 


99 


O fJ 


I JTOOo 


noma sapiens 


oecrcico prviciii i uo uuj j \tl i i l, 


1 1 Id. 


53 


o /o 




Homo sapiens 


XTiunan secreicu protein, ocy iu ivu. / o / o. 


1 1A 
1 f*r 


1A 


677 


AF026954 


Bos taurus 


pyruvate dehydrogenase phosphatase regulatory 
subunit precursor; PDPr 


1013 


95 


678 


L11625


Mus 

musculus 


receptor protein-tyrosine kinase


545 


96 


679


ALlAi J 427 


Homo sapiens 


d J 1 o / A i y.J (novel protein ) 


7A^ 


HAJ 


fort 

ooU 


AJ Ui4JU 


Mus

musculus 


olfactory receptor 




77 


681 


G02532 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6613. 


179 


70 




(Tea too 


Homo sapiens 


riuman secretea proicin, ot ^ iu inu, / o / u. 




/o 


683 


Y94943 


Homo sapiens 


Human secreted protein clone yt14_1 protein

SeO^UCnCC SCy LU 


118 


100 


684 


U43360 


Peromyscus 
maniculatus


reverse transcriptase 


100 


37 


685 


G00885


Homo sapiens 


Human secreted protein, SEQ ID NO: 4966. 


162 


60 


£. OJC 
OOU 


AKUUI5 io 


Homo sapiens 


unnamed protein product 


jyu 


1 uu 


687 




Homo sapiens 


riuman secretea protein, afcQ iu isu. ouoj. 


/ 18 


i nn 


(•OO 




Homo sapiens 


Human cancer associated antigen precursor 

f\AC\ TJ KKI AfC\ 






ooy 




Caenorhabditis
elegans


coniains simiitinQf to iR.r / ojid 




36 


690 


Y27868 


Homo sapiens 


Human secreted protein encoded by gene No. 
107. 


183 


81 


691 


Y56514 


Homo sapiens 


Human Jurkat cell clone P2-15 AIM 10 longest 
ORF protein sequence. 


180 


88 


692 


Y27795 


Homo sapiens 


Human secreted protein encoded by gene No. 79. 


1539 


99 




I JvjaQO 




T-li im ftn c^prptpri r\rotf*tt*i ^nprvi^rl hv dptip jIS 

Al Ull J«UI >^UbtvU L/l VJ L W-U J ^U^VA4&kl Vjf f^mili^ 'VJ* 


428 


98 




U 1 iWJ 






308 


89 


695 


Y45272 


Hnmn wnirn^ 


1-Tiirrian ^crrptcH nrntein ftncndEri from oene lri 

llUlf lull o»*vi Vl**U Ul ULVU1 Vlll'VUWAl llufll gblJ\j 1U« 


1517 


99 


696 


AF191838 


Homo sapiens 


TANK binding kinase TBK1 


1242 


98 


o> / 


I 


Homo sapiens 


Human secreted protein encoded by gene 44
clone HTDAD22.


275 


75 


698 


Y87280 


Homo sapiens 


Human signal peptide containing protein HSPP- 
^7 *?Pr» TTl NO -57 


576 


90 


699 


Y97999 


Homo sapiens 


Human SCAD family molecule HSFM-1, SEQ 
ID NO:1.


729 


99 


700 


AJ006701 


Homo sapiens 


putative serine/threonine protein kinase


610 


79 


701 


AF209198 


Homo sapiens 


zinc finger protein 277 


2357 


100 


702 


AJ298841 


Mus 

musculus 


torsinA protein 


709 


45 


703 


AK021729


Homo sapiens 


unnamed protein product 


622 


98 


704 


Z46787 


Caenorhabditis
elegans


similar to Glutaredoxin, Zinc finger, C3HC4 
type (RING finger) 


920 


51 


705 


G02882 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6963. 


589 


98


irui 
/Uo 


nA7 sfi i 

UUijU J 


— — J : 

rlOTTlO SapiSn5 


xiumaii secreted pruicui, oc>Vf ■u-' o jaz. 


1 7^ 

IZJ 


sit 

JO 


707 


RS>5326 


Homo sapiens 


Tumor necrosis factor receptor 1 death domain 
iigano (^cionc 


121 


95 


*7nc 
/UB 




; 

Homo sapiens


riuiiiaii sccrcieu pruicm, jcy x\j tvoj* 


1 7^ 


■iQ 

J7 






Homo sapiens


lKappaD JC1T185C ^LN-Pwl OlTlQing prULCUi, I XrlJO. 


Sift 
j 1 1> 




71 ft 

1 \ 0 


rviojj / / 


Saccharomyces
cerevisiae




111 




71 1 




Rattus 
norvegicus 


OjVsOIVJ CUkr^s \j 1 KJ r\. JJf 1 J LI Initio Vi 


467 


85 


717 

/ 


n7i7i i 




protein tyrosine p n o&pna t <igc \tr i r~OAj, type j ) 




*l*r 






Marmota
marmota


olfactory receptor 


A1 < 


01 
8 J 


714 




Homo sapiens


rturnan secrete a protein, oti^ iu rvu. /o**z. 


7S 1 
Zj 1 


1 r\A 


T 1 S 




Homo sapiens 


VTA A 101£ nrrtUm 

rviAAJZjo protein 




1 An 
1UU 


716 


G00577 


Homo sapiens 


Human secreted protein, SEQ ID NO: 4658. 


80 


73 


/ 17 




Homo sapiens 


SEQ. ID. 37 from WO0034474.




yy 


718 


AJ243396 


Homo sapiens 


voltage-gated sodium channel beta-3 subunit 


234 


100 


719 


U47334 


Homo sapiens 


similar to chicken gamma aminobutyric acid
receptor beta4 subunit


578 


99 


720 


AB020598 


Homo sapiens 


peptide transporter 3 


1096 


100 


721 


Y53886 


Homo sapiens 


A suppressor of cytokine signalling protein 
designated HSCOP-3. 


570 


74 


722 


J05046 


Homo sapiens 


insulin receptor-related receptor 


6787 


100 


723 


AF001958 


Ambystoma
tigrinum 


electrogenic Na+ bicarbonate cotransporter, 
NBC 


111 


41 


724 


AF127084


Mus 

musculus 


semaphorin cytoplasmic domain-associated
protein 3A


5253 


94 


725 


X54673 


Homo sapiens 


GABA transporter 


3114 


99 


726 


AF016191 


Rattus 
norvegicus 


potassium channel 


370 


100 


727 


AB029559 


Rattus 
norvegicus 


BAT1 


139 


35 


728 


Y28503 


Homo sapiens 


HGFH3 Human Growth Factor Homologue 3.


2186 


97 


729 


AJ011415 


Homo sapiens 


plexin-Bl/SEP receptor 


729 


56 


730 


Z93096 


Homo sapiens 


bK390B3.1 (manic fringe (Drosophila)
homolog) 


142 


68 


731 


Z10062 


Homo sapiens 


cDNA encoding a human vanilloid receptor 
homologue Vanilrep1.


675 


99 


732 


AF161382


Homo sapiens 


HSPC264 


492 


94 


733 


AB029033 


Homo sapiens 


KIAA1110 protein


3826 


99 


734 


AE000493 


Escherichia 
coli 


putative transport protein 


592 


97 


735 


AL033379


Homo sapiens 


dJ417022.2 (novel 7 transmembrane receptor
(rhodopsin family) protein similar to
high-affinity lysophosphatidic acid receptor homolog)


2173 


99 


f JSJ 


/ vr i jzj77 


Homo sapiens


c\/\iniijo laciui ui iaLC acuvaiCu i iyrnpnoisyic5~ 

1 
1 




JD 


737 


X55019 


Homo sapiens 


acetylcholine receptor delta subunit 


883 


99 


738


Yoi on/i 


Homo sapiens 


voltage-gated chloride ion channel


1 07R 


i nn 

1UU 






Homo sapiens 


organic anion transporter 4 






740 


D00570 


Mus 

musculus 


open reading frame (196 AA) 


83 


24 


741 


W03626 


Homo sapiens 


Human thyrotropin GPR N-terminal sequence. 


118 


40 


1*\A 




Homo sapiens 


V_segment translation product 


014 


1UO 




A C 1 1 DO 1 < 


Homo sapiens 


G-protein-coupled receptor




OA 


744




Homo sapiens 


nucinaiopoieiic imcagc ecu protein ^/\/\ i-*oo^ 


1 dft 
its 


7J 


745 


W67838 


Homo sapiens 


Human secreted protein encoded by gene 32 
clone HLTCJ63. 


448 


95 


746 


W57260 


Homo sapiens 


Human semaphorin Y. 


2414 


100 


747 


W21578 


Homo sapiens 


Alzheimer's disease protein encoded by DNA 
from plasmid pGCS2232. 


968 


65 


748 


Y94935 


Homo sapiens 


Human secreted protein clone yd2l R \ protein 
sequence SEQ ID NO:76. 


622 


100 


749 


AL022238 


Homo sapiens 


dJ1042K.10.5 (novel protein) 


314 


85 


750 


G03889 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7970. 


391 


87




751 


AB025258


Mus 


granuphilin-a 


773 


41 






musculus 








752 


Y52386 


Homo sapiens 


Human transmembrane protein H^uilMJ, 


ftnft 

yW 


yy 


753 


Y48586 


Homo sapiens 


Human breast tumour-associated protein 47.




yy 


754 


AJ272207 


Homo sapiens 


putative G protein-coupled receptor 92 


oy4 


1 An 


755 


M85183


Rattus 


vasopressin receptor 


979 


68 






norvegicus 








756 


AF190501


Homo sapiens 


leucine-rich repeat-containing G protein-coupled 


388 


71 








receptor 6 






757 


Y02692 


Homo sapiens 


Human secreted protein encoded by gene 43 


461 


on 
8/ 








clone HI AUJt.1 /. 






758 


Z22535 


Homo sapiens 


ALK-3 


439 


98 


759 


R04932 


Homo sapiens 


Interferon-gamma receptor segment from clone 


564 


97 








39 responsible for binding the target






760 


W74902 


Homo sapiens 


Human secreted protein encoded by gene 175


1217 


99 








clone HE8BI92. 






761 


G03706 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7787. 


223 


88 


762 


AB020676


Homo sapiens


KIAA0869 protein


4433 


99 


763 


AK026992 


Homo sapiens 


unnamed protein product 


2285 


99 


764 


AF173358


Homo sapiens


glucocorticoid receptor AF-1 coactivator-1


573 


100 


765 


AF268066 


Mus 


netrin 4 


2019 


89 






musculus 








766 


Y48585 


Homo sapiens 


Human breast tumour-associated protein 46. 


1169


89 


767 


AF230378 


Mus 


interleukin-1 delta


309 


45 






musculus 








768 


AF121975


Mus


odorant receptor S18


268 


62 






musculus 








769 


AB008515 


Homo sapiens 


RanBPM 


611 


57 


770 


Y09945 


Rattus 


putative integral membrane transport protein 


458 


50 






norvegicus 








771 


AF226731


Homo sapiens 


AD026 


688 


99 


772 


Y27132 


Homo sapiens 


Human glioblastoma-derived polypeptide (clone 


1384 


100 








OA004FG). 






773 


X87832 


Homo sapiens 


NOV/plexin-Al protein 


1821 


98 


774 


AB025258 


Mus 


granuphilin-a 


500 


41 






musculus 








775 


AF125101 


Homo sapiens 


HSPC040 protein 


232 


93 


776 


G02815 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6896. 


314 


95 


777 


G02493 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6574.


191 


68 


778 


R03301 


Homo sapiens 


Sequence of pre-human atrial natriuretic peptide. 


213 


45 


779 


AL357374 


Homo sapiens 


bA353C18.2 (novel protein) 


232 


100 


780 


AF100346


Homo sapiens 


neuronal voltage gated calcium channel gamma- 


1434 


89 








3 subunit 






781 


Y19566


Homo sapiens 


Amino acid sequence of a human secreted 


103 


52 








protein. 






782 


Y36233 


Homo sapiens 


Human secreted protein encoded by gene 10. 


1 AOO 

luys 


yj 


783 


AF084464


Rattus norvegicus


GTP-binding protein REM2


141


30








784 




Homo sapiens 


Human low density lipoprotein binding protein 




oo 
yy 














785




: 

Homo sapiens 


riuv I 


1 QTLL 


j 1 


no/ 

786


Y91S7Q 


Homo sapiens 


Human apoptosis related protein. 


34 / 


i r\n 
I W 


787 


Y710O2 


Homo sapiens 


Human membrane transport protein, M 1 Ki - /. 


I \Jt>Z 


OA 

y4 


788 


AF117754


Homo sapiens 


thyroid hormone receptor-associated protein 


S0«4 


no 

ys 








complex component TRAP240 






TOO 


a r ruoc£a 


Homo sapiens 


aJJ l u.j (novel A i rase) 




yo 


790 


AF151848


Homo sapiens 


CGI-90 protein 


745 


96 


791 


Y08639 


Homo sapiens 


nuclear orphan receptor ROR-beta 


1421 


95 


792 


Y41706 


Homo sapiens 


Human PRO381 protein sequence.


644 


99 


793 


AF121228 


Homo sapiens 


thyroid hormone receptor-associated protein complex component TRAP95


1037


100






794 


G04072


Homo sapiens 


Human secreted protein, SEQ ID NO: 8153. 


124 


62 


795 


Y69384 


Homo sapiens 


Amino acid sequence of a 14274 receptor protein.


119


100






796 


W40215 


Homo sapiens 


Human macrophage antigen. 


1358 


99 





Homo sapiens 



hepatocellular carcinoma-associated antigen 112 


llJi 




100


API <QX t S 

AT 1 2>yo I z> 


Homo sapiens 


FGF receptor activating protein 1 


*o 1 


OS 

yo 


799


I J700J 


Homo sapiens 


numan normal uterus iissuc aenvcu piuicui -f-o. 


7Q7 


oo 




VJtfXA ^0 


Homo sapiens 


Human T1-receptor ligand III splice variant 2.


5 II. 


yz 


801


L.UUU 15 


Homo sapiens 


renin 


inn 

ty if 


yj 


802


PQ'T'T 1 Q 


Homo sapiens 


lki protein. 


t t Q« 

1 1 yo^ 


y / 






(human) 








803


VI 

AIDjj / 


Homo sapiens 


ANP-A receptor preprotein (aa -32 to 1029)


c i cko 
j 1 yy 


yo 


804




Homo sapiens 


Human secreted protein from clone EC172_1.


jnip 
4U I 8 


yj 


805


AJ24J5 /4 


Homo sapiens 


oligophrenin-4


Z\)o ( 


i An 


806 


G01731 


Homo sapiens 


Human secreted protein, SEQ ID NO: 5812. 


284 


100 


807 


Z24680 


Homo sapiens 


garp 


1562 


83 


808 


AF171669 


Homo sapiens 


glycoprotein-associated amino acid transporter LAT2


1364


90






809 


W70321 


Homo sapiens 


Secreted protein CC198_1. 


1154


96 


810 


W74843 


Homo sapiens 


Human secreted protein encoded by gene 115 clone HOVBA03.


855


99






811 


AF108831


Homo sapiens 


K:Cl cotransporter 3


4561 


100 


812 


AF092135 


Homo sapiens 


PTD014


862 


100 


813 


AF283772


Homo sapiens 


similar to Homo sapiens ribosomal protein L10 encoded by GenBank Accession Number L25899


784


100






814 



G01563 


Homo sapiens 


Human secreted protein, SEQ ID NO: 5644. 


330 


100 


815 


AF051151


Homo sapiens 


Toll/interleukin-1 receptor-like protein 3


3850 


99 


816 


W95630 


Homo sapiens 


Homo sapiens secreted protein gene clone 


358 


100 








gnI14_l. 






817 


G01082


Homo sapiens 


Human secreted protein, SEQ ID NO: 5163. 


549 


100 


818 


AF151800 


Homo sapiens 


CGI-41 protein 


1106


95 


819 


L00352 


Homo sapiens 


low density lipoprotein receptor


3980 


100 


820 


X04434 


Homo sapiens 


IGF-I receptor 


5832 


99 


821 


G03844 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7925. 


572 


100 


822 


AF212220


Homo sapiens 


TERA 


396 


48 


823 


Y50125 


Homo sapiens 


Human glycophosphatidylinositol-anchored protein GPI-122.


4897


99






824 


AF156778


Homo sapiens 


ASB-3 protein 


2675 


98 


825 


AF096322 


Homo sapiens 


neuronal voltage-gated calcium channel gamma-2 subunit


1105


100






826 


Y07972 


Homo sapiens 


Human secreted protein fragment #2 encoded from gene 28.


1540


100






827 


AB032013 


Homo sapiens 


potassium channel Kv8.1


2435 


95 


828 


Y13620 


Homo sapiens 


BCL9 


5284 


96 


829 


Y91474 


Homo sapiens 


Human secreted protein sequence encoded by gene 24 SEQ ID NO:147.


541


98




830 


X54232 


Homo sapiens 


glypican 


1625 


Of 
Of 


831 


X14830 


Homo sapiens 


acetylcholine receptor beta-subunit preprotein


2540 


100 




Y71262


Homo sapiens 


Human chondromodulin-like protein, Zchm1.


1 A/11 

IUU2 


lw 


833 




Homo sapiens 


Human secreted protein, abQ IL> NO. 7y54. 




yo 


834 


AL-UU3U3U 


Homo sapiens 




13oy 


yj 


835 


Y35422 


Homo sapiens 


Human secreted protein. 


964 


S7 


836 


U41557


Caenorhabditis elegans


glycine-rich


O K.

03


36








837 



AL121889 


Homo sapiens 


dJ107ocl7.1 (KIAA0823 protein (continues in 


998 


75 














mo 

Sjo 


AJ011413


Homo sapiens 


piexin-tH/acr receptor 


1 3oU 


£n 

ol) 


RIO 




iL(jii]<j sapiens 


j\ sec rc icq protein cncoAj cq oy cionc i 


1 1 OS 

J l VD 


o / 


840 


G00862 


Homo sapiens 


Human secreted protein, SEQ ID NO: 4943. 


255 


92 


841 


G02650 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6731. 


644 


97 


842 


AF036717 


Homo sapiens 


FGFR signalling adaptor SNT-1 


2629 


99 


843 


Y73446 


Homo sapiens 


Human secreted protein clone yc27_1 protein sequence SEQ ID NO:114.


1089


100






844 


G02872 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6953. 


357 


69 


845 


AF151810 


Homo sapiens 


CGI-52 protein 


1443 


88 


846 


X83378 


Homo sapiens 


putative chloride channel 


1620 


99 


847 


AC004883 


Homo sapiens 


similar to general transcription factor 21; similar j 


655 


96 













to AF03RQfiQ fPin-cr?Ji772071 






848


Y DOSHA 



Homo sapiens 


monocyte chemotactic protein-2


i tin 

1 OiJ 


7A 

/o 






Homo sapiens 


similar to mouse olfactory receptor 13; similar to 




Qft 








P34984 CPIDe464305> 








ARfnJl737 


I1U1IIU ACl^JlwJlA 


VJ ^Ji Ulblil— ^UUL/IC3U ICLCpiUI V J 1 .i. 


1767 


100 


851 


AF124490






3415 


Q8 




Y86217 


T-Tnmrt tt**ni**Yl^ 


Human irrrrtrri nrntrin HWHfitlSd <5FO TD 


1 1 RQ 

1 1 07 


OQ 








NO 119 






O JJ 


AF22474 1 




f*hlf\riH^ f*hatm**t rot p i o 7 


3748 






A 1 / uy*t 


Homo sapiens




JjjU 


OO 




W /oZ^j 


Homo sapiens 


Fragment of human secreted protein encoded by 


1 1A < 

1Z45 


yy 








gene 1 7* 






O JO 


I\J 'JU7 


Homo sapiens


unci jcumu-z receptor associated proicm p4j. 




i nn 


60 / 


vai 

T41 ;oj 


Homo sapiens


Human PRO1083 protein sequence.


n 1 1 


yy 


O Do 


AFrt^TW; 
rVT U J fj\K> 


Homo sapiens 


transmembrane proteolipid 


Ant 


OA 

04 


OJ7 




Homo sapiens 


uiinarncu proiein prLmut^r 






fi£0 

sou 


VA 1111 


Homo sapiens 


Human secreted protein encoded by gene 5 clone 




i nr\ 
IUU 








HLUKM43. 








i&j / /£> 


Homo sapiens 


Human secreted protein encoded from gene 66. 


DOC 


yy 


oo3 


vi^ too 


Homo sapiens 


Human prostate tumor EST fragment derived 




30 








protein ff375. 






oo4 


At Id /4 / j 



Homo sapiens 


heme-binding protein


OTA 


OO 

yy 






Homo sapiens 


Human secreted protein, ID NO: oo J J . 


Til 
Zli 


6/ 


obo 


A5487U 


Homo sapiens 


Type II integral membrane protein 


12U1 


100 


Co / 


(jvU7UU 


Homo sapiens 


Human secreted pro tern, at<j UJ NO. 4/01. 


o4U 


99 


868 


YU/S94 


Homo sapiens 


Human secreted protein fragment encoded from gene 43.


388


88






ooy 


TOO 1 Tl 


Homo sapiens 


pre pro enkephalin ( 




yj 


em 


vo i fin i 


Homo sapiens 


Human secreted protein sequence encoded by 


1 r\>i p 


yo 








rma.^^. n cert it~\ XTrA^n^ 

gene £j oh^i \l> InL>. JUa. 






o/i 


T Oyl 3 11 
1 1 


Homo sapiens 


GABA-alpha receptor beta-3 subunit 


/ 


93 


872 


Y2y9oo 


Homo sapiens 


Human cytokine family member EF-7 protein. 


960 


94 


873 


Ar loI352 


Homo sapiens 


H5rC2o4 


1124


99 


874 


G03412 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7493. 


464 


100 


0 Ij 


Y.Z tj ) Z 


Homo sapiens 


Human secreted protein encoded by gene No. 6. 


373 


y& 


B /D 


\I1 <<10 


Homo sapiens 


B-cell growth factor


171 


56 


877 


W63681 


Homo sapiens 


Human secreted protein 1. 


1652 


99 


878




Rattus norvegicus


neurexophilin


144S


OO 1

98








o /y 




Homo sapiens 


Amino acid sequence of a human secreted protein.


3ZI


JUU






sen 




Homo sapiens 


Polypeptide fragment encoded by gene 144. 


yjo 


1 OO. 


oo 1 


API 1 ftAld 
J\T 1 1 OO / U 


Homo sapiens 


orphan G protein-coupled receptor 


i mi 


TOO 

IUU 






Homo sapiens


LUlu 


jZo 


1 nn 

IUU 


RSI 


I 1 040Z 


Homo sapiens 


cathepsin L 






oof 


i y**v3\j 


Homo sapiens


Human secreted protein clone dh!073 12 protein 




inn 

IUU 














885 


AF070661 


nijiiiiJ j afj i ei j 3 






inn 

i m/ 


886 


Y04315 








100 


887 




nomo I CJ 15 


noi>i 


^7^ 


inn 

IUU 


888 


I *At7U 


Homo sapiens


LnfitTnnnn o^/^rvt^/*! nmt^in o^viti^n/^j* plnn^ 

numfln sccreicQ pruicin scc|ucncc cionc 


OQd 


OA 
yt 








r-nA71 S 








Y4 1 703 


Homo sapiens


nujiitui soiuoic proicm z-i tvurvj-i . 


HJ77 


QQ 


890 


G03714 


Homo sapiens


Human secreted protein, SEQ ID NO: 7795.


147 
l ^ / 


DJ 


891 


AF208856 


Homo sapiens 


BM-014 


1012 


99 


892 


U29195 


Homo sapiens 


neuronal pentraxin II


2002 


98 


893 


X68149 


Homo sapiens 


Burkitt lymphoma receptor 1 


1953 


100 


894 


Y94914 


Homo sapiens 


Human secreted protein clone pw337_6 protein 


537 


100 








sequence SEQ ID NO:34. 






895 


W61630 


Homo sapiens 


Clone HNFGW06 of EGFR receptor family. 


326 


63 


896 


M24110 


Homo sapiens 


GOS19-2 peptide precursor


481 


100 


897 


Z68747 


Homo sapiens 


imogen 38 


2018 


99 


898 


AF186112


Homo sapiens 


neurokinin B-like protein ZNEUROK1


619 


100 


899 


AF225420 


Homo sapiens 


AD025 


734 


100 


900 


P60657 


Homo sapiens 


Sequence of human lipocortin. 


1835 


100 


901 


M27288 


Homo sapiens 


oncostatin M 


1297 


99 


902 


W85737 


Homo sapiens 



Polypeptide with transmembrane domain. 


749 


100 


903 


G01349


Homo sapiens 


Human secreted protein, SEQ ID NO: 5430. 


650 


99 


904 


Y00261 


Homo sapiens 


Human secreted protein encoded by gene 4. 


1133 


99 


905 


AF039688 


Homo sapiens 


antigen NY-CO-3 


771 


99 


906 


AB007836 


Homo sapiens 


Hic-5 


2544 


100 


907 


AB017507


Homo sapiens 


Apg12


224 


100 


908 


AK000056 


Homo sapiens 


unnamed protein product 


1537 


98 


909 


Y86299


Homo sapiens 


Human secreted protein HFOXB55, SEQ ID NO:214.


427 


100 


910 


AF231023


Homo sapiens 


protocadherin Flamingo 1 


7393 


99 


911


Y14134 


Homo sapiens 


Vascular endothelial cell growth inhibitor beta 
protein sequence. 


1319 


100 


912 


290420 


Homo sapiens 


Human GDF-3 (hGDF-3) polypeptide encoding 
cDNA. 


1950 


100 


913 


Y19757


Homo sapiens 


SEQ ID NO 475 from WO9922243.


1361 


100 


914 


G03172 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7253. 


112 


48 


915 


U14971


Homo sapiens 


ribosomal protein S9


886 


90 


916 


AF172854


Homo sapiens 


cardiotrophin-like cytokine CLC


1204 


99 


917 


AC005525 


Homo sapiens 


F22162 1 


1963 


100 


918 


AF166350


Homo sapiens 


ST7 protein 


4711 


99 


919 


Y87285 


Homo sapiens 


Human signal peptide containing protein HSPP-62 SEQ ID NO: 62.


430 


100 


920 


Y36131 


Homo sapiens 


Human secreted protein #3. 


465 


88 


921 


AF193766


Homo sapiens 


cytokine-like protein C17 


724 


100 


922 


Y95013 


Homo sapiens 


Human secreted protein vc48_1, SEQ ID NO:66.


357 


100 


923 


X75208 


Homo sapiens


protein tyrosine kinase-receptor


5256 


100 


924 


Y96202 


Homo sapiens 


IkappaB kinase (IKK) binding protein, Y2H56. 


813 


98 


925 


AB039886 


Homo sapiens 


down-regulated in gastric cancer 


785 


78 


926 


G03368 


Homo sapiens


Human secreted protein, SEQ ID NO: 7449. 


55 


50 


927 


Y48606 


Homo sapiens 


Human breast tumour-associated protein 67. 


539 


100 


928 


Y36151 


Homo sapiens 


Human secreted protein #23. 


668 


100 


929 


AF1 103QQ 


Homo sapiens




1666 


100 


930 


AF210317 


Homo sapiens 


facilitative glucose transporter family member 
GLUT9 


2763 


99 


931 


Y73328 


Homo sapiens 


HTRM clone 082843 protein sequence. 


931 


100 


932 


G01959 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6040. 


274 


100 






Homo sapiens




1469 


100 


934 


G03827 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7908. 


529 


93 


935 






mitochondrial ABC transporter 3


196 


63 


936


X56385 


Canis familiaris


rab8 


1064 


100 


937 


B08906 


Homo sapiens


Human secreted protein sequence encoded by
gene 16 SEQ ID NO:63. 


117


44 


938 


M13692


Homo sapiens 


alpha-1 acid glycoprotein precursor


1064 


99 


939 


Y53886 




A suppressor of cytokine signalling protein
designated HSCOP-6.


515 


42 


940 


Y16630 


Homo sapiens 


Human Putative Adrenomedullin Receptor 


1904 


99 


941 


AC005102 


Homo sapiens 


small inducible cytokine subfamily A member 
24 


627 


99 


942 


M12886


Homo sapiens 


T-cell receptor beta chain 


1289 


81 


943 


AF226046 


Homo sapiens 


GK003 


1049 


98 


944 


Y36078 


Homo sapiens 


Extended human secreted protein sequence, SEQ 
ID NO. 463. 


667 


100 


945 


M22877 


Homo sapiens 


cytochrome c 


565 


100 


946 


W67869 


Homo sapiens 


Human secreted protein encoded by gene 63 
clone HHGDB72. 


551 


93 


947 


W67859 


Homo sapiens 


Human secreted protein encoded by gene 53 
clone HBMCL41. 


283 


100 


948 


W85726 


Homo sapiens 


Novel protein (Clone BG33_7). 


789 


100 


949 


AJ242015 


Homo sapiens 


eMDC II protein 


4236 


100 


950 


G04075 


Homo sapiens 


Human secreted protein, SEQ ID NO: 8156. 


567 


99 





951 


AF110645


Homo sapiens 


candidate tumor suppressor p33 ING1 homolog 


1314 


100 




Y36111


Homo sapiens


Extended human secreted protein sequence, SEQ
ID NO. 496. 


402 


70 


953 


AB012109


Homo sapiens


APC10 


990 


100 


954 


AF246221 


Homo sapiens


transmembrane protein BRI


1405 


100 


955 


AF054986 


Homo sapiens 


putative transmembrane GTPase 


1883 


100 


956 


W74726 


Homo sapiens


Human secreted protein fs949_3


1879 


100 


957 


Y27096 


Homo sapiens 


Human viral receptor protein (ACVRP). 


1581 


100 








v ■ n n q 1 ti 
^ y d L| 2* i 11 


1920 


100 




I .),?V/-J<£ 




T— Ti imgn tu»r*rpfpH orotf^in clone tH"202. ^ nrnteiti 

IlUlilCUl ^MflvlVU V V1UUV Uli*Vi< J ^Jl V will 

^eauence SEO ID NO* 1 1 0 


587 


100 




G02694 


Homo sapiens


Human secreted protein, SEQ ID NO: 6775.


283 


100 


961 


API 




nrntem 


1214 


96 






nviny aapicii;* 


rllflh**fi i, 3 mpHitira tvnt* T niitfinntippn 


250 


65 


?OJ 




riUIHU dap 1C J Id 


HTa17^R7 9 ^nnvel nrntr*m1 


3796 


100 




r\ r U / 00J7 


nUlTlO SapiCll? 


PTD004 


2089 


100 


965 


AB020315 


Homo sapiens 


homologue of mouse dkk-1 gene:Acc#


1466 


100 


7OO 




numu dopicjia 


nr^nircnr TinJvn^nf irf ^ f AA tn 1 f R "S^ \ 

pibLiUlaUl LrVJJJf pCpLlU** I^A/l ^ ^« I 1UJ j 


6580 


99 


70 / 




rtoiiio sapiens 


n f>no tr^i - *! 1 * ill 1 1 nr pflrnnnmfl snt'i cr^il o^ri p* ^ / M 
JlCpElOCClIlilal UUI UI1 UJIIlil rtUU^Cll gCllC JZU 




99 


y&o 


At U / i UtW 


Homo sapiens 




632" 


1 w 


7U7 


Ann? 1 T'5'7 


ntjrrtu btipicns 


in^inurtujc ijrp^ J niQiriiV inciwivpn^i^iiiadA' 




100 


970 


AF180920


Homo sapiens 


cyclin L ania-6a 


1579 


100 


971 


Ar IUjJoD 


Homo sapiens 


cotransporter 


jOZI 




972 


AF083248 


Homo sapiens 


ribosomal protein L26 homolog 


739 


100 


973 


AJ132429


Homo sapiens 


hyperpolarization-activated cyclic nucleotide
gated cation channel hHCN4 




100


974 


W61619 


Homo sapiens 


Clone H J rfcr 80 01 lM4ar sup-ertamily. 




1UU 


975 


AF155100


Homo sapiens 


zinc finger protein NY-REN-21 antigen


2261 


100 


976 


AF275948


Homo sapiens 


ABCA1 


11763


9V 


977 


AB026891 


Homo sapiens 


cystine/glutamate transporter 


ICO 


100 


978 


AF117657


Homo sapiens 


thyroid hormone receptor-associated protein 
complex component TRAP80 


11 AQ 

i34S 


yy 


979 


AF044201


Rattus 
norvegicus 


neural membrane protein 35; NMP35 


157U 


92 


QQft 

ysu 


Aril n fOT 

Ar I lyzV / 


Homo sapiens 


neuroendocrine-specific protein-like protein 1


1 1 / u 


OO 

77 


981 


AF155652


Homo sapiens 


potassium channel modulatory factor 


1983 


99 


982 




Homo sapiens 


Human stomach carcinoma clone rlrlU4l^- 
encoded protein. 






983 


Z56281 


Homo sapiens 


interferon regulatory factor 3 


2012 


98 


984 


AJo UJ.01 2j> 


Homo sapiens 


A T> T A 

AK1-4 


Z 1 Oil 






\/ 1 A A 

Y 144S2 


Homo sapiens 


Fragment of human secreted protein encoded by 
gene 17. 


1 TO 




ytso 


ABU2J0SH 


Homo sapiens 


b-chemokine receptor CCR4 


1 SQS 
1 07J 




DOT 


/.zy 1 


Homo sapiens 


Human H1075-1 secreted protein 5' end.


/it 


1VAI 


yoo 


Ai" 1 J j4jU 


Manduca 
sexta


juvenile hormone esterase binding protein 






989 


G03697 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7778. 


194 


88 


000 
yyu 


a n aa 1 <o 


Homo sapiens 


potassium large conductance calcium-activated


l AHA 


inn 


991 


G02061 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6142. 


558 


99 






Caenorhabditis
elegans


wui OAT? 1 


327 




QQ'l 
77J 


I DO /IT 


nUUiU bdpi&Ila 


ivi giu u ion c-iwujui pruicin sr r^\j l i £ t . 


4730 


99 


994 


G01246


Homo sapiens 


Human secreted protein, SEQ ID NO: 5327. 


141 


77 


995 


AF133845


Homo sapiens 


corin 


5811 


99 


996 


AF117756


Homo sapiens 


thyroid hormone receptor-associated protein 
complex component TRAP150


4999 


100 


997 


W62066 


Homo sapiens 


Human stem cell antigen 2. 


284 


93 


998 


Y87173 


Homo sapiens 


Human secreted protein sequence SEQ ID 
NO:21Z 


725 


100 


999 


Y13379


Homo sapiens 


Amino acid sequence of protein PRO263.


1654 


99 


1000 


Y95008 


Homo sapiens 


Human secreted protein vf3_1, SEQ ID NO:56.


676 


47 


1001 


AF190167 


Homo sapiens 


membrane associated protein SLP-2 


1747 


100 
1002




Homo sapiens


Unman ?*rn»tarl nrnt^in CCD TT1 "MO- 1 *i 

numau scvrcieu prQicm, jty iu r<iKj. jju. 


398


96


1003 


W73420


Homo sapiens 


Human secreted protein encoded by Gene No. 


2150


100


1004


v 1 9*7Q1 


Homo sapiens


10t-n *!R P-nrnt(«in MA 1 . 1 AA\ 
i yKU oi\r-yiULcui i i j*aj*i a - i t*rj 


742 


100 


1005


(VLZ.J JZJ 


Homo sapiens 


membrane protein 


642 


100 


1006


AOJ /40 


Homo sapiens 


MJr '' recepior 


326 


98 


1007


Y35997


Homo sapiens


Extended human secreted protein sequence, SEQ
ID NO. 382. 


824


99


1008




I— Ivy 1 nKat^o 

riy locates 

ITIOjOCJ] 


dopamine receptor D4 




"is 






Homo sapiens 


— -j —. A AU 

Human secreted protein sequence encoded by 

gene Ol oLy iXJ iH\J .jjj. 


mi 


99 


1010


AT llfilT-S 


Homo sapiens


tij juhd i ^ikjvci pruicin^ 








Uv J /J J 


Homo sapiens 


Wiiman cM-rrh'rl nrrtt^i'n QF^t TD NfY 7fi14 


i /y 


Oft 




VI 7-sl 1 

11/ J J 1 


Homo sapiens


numau scvreicu pruicm ciunc ijijj_u J i*t pruLcm. 


oio 


yv 


1ft 1 1 


norm 4 


Homo sapiens


U lltnj kii cN>n>tHl nnnMin CP O IPl "MO- /ISO'S 

numau sccrcicu proicm, 3Dy iu iiu. iouj. 


4oZ 


IUU 


1014 


AF288092 


Naegleria
gruberi 


haem lyase 


114


37 


1 v 1 J 


ADfVl <;0(Y1 


Homo sapiens


.VI o j protein 


3867 


99 




V 1 <5Q/lf* 


Homo sapiens 


ribosomal protein L31 (AA 1-125) 


644 


100 


1017


yy4o7j 


Homo sapiens 


Human protein clone MruzoJz. 


1876 


100 


1018 


AL024498 


Homo sapiens 


dJ417M14.1 (novel protein) 


589 


100 


1019 


X83425 


Homo sapiens 


Lutheran blood group glycoprotein 


3054 


99 


1020 


W03516 


Homo sapiens 


Prostaglandin DP receptor. 


1864 


100 


1021 


G03960 


Homo sapiens 


Human secreted protein, SEQ ID NO: 8041.


398 


100 


1022 


Y91689 


Homo sapiens 


Human secreted protein sequence encoded by 
gene yj ofc^ jlu in^j.joji. 


768 


100 


1023





Homo sapiens 


nAlJ VJOol 


573 


100 


1024 


AF132905


Homo sapiens 


CGI-31 protein


1550 


100 


1025 


W92380 


Homo sapiens 


Human TR-interacting protein S103a. 


1466 


97 


1026 


R66278 


Homo sapiens 


Therapeutic polypeptide from glioblastoma cell 
line. 


830 


100 


1027 


X65614 


Homo sapiens 


S100P calcium-binding protein


476 


100 


1028 


Y41741 


Homo sapiens 


Human PRO704 protein sequence. 


1323 


100 


1029 


AJ001014 


Homo sapiens 


RAMP1


806 


100 


1030 


W63682 


Homo sapiens 


Human secreted protein 2. 


1354 


99 


1031 


AK023007


Homo sapiens 


unnamed protein product 


766 


100 


1032 


W97900 


Homo sapiens 


Human SR-BI class B scavenger. 


2672 


99 


1033 


Y82453 


Homo sapiens 


Human TGC-440 secretory protein SEQ ID 
NO:1.


639 


99 


1034 


Y73473 


Homo sapiens 


Human secreted protein clone ydl78_I protein 
sequence 1L> NU:tb8. 


752 


93 




I B0405 



Homo sapiens 


Human gene 48-encoded protein fragment, SEQ


96 


90 


1UJO 


uuyou 


Homo sapiens


ILULUCllUIlLUulJ r\ 1 XT a^UUlaaC aUUUllll 7 prccuraui 


698 


100 


1037 


AJ242832 


Homo sapiens 


calpain 


3699 


99 


1038 


X66403 


Homo sapiens 


acetylcholine receptor epsilon subunit CHRNE 


2574 


100 




AJZ4Z/JU 


Homo sapiens 


polyhomeotic 2 


1310 


100 


1040


A C 1 HOCX O 


Mus musculus


uotA oinaing proiein ucok i 


1453 


80 


1041


ADZ^O.5 


Bos taurus 


permeability increasing protein


383 


29 


1042


vjUUJoo 


Homo sapiens 


riuman secrcico protein, ocv^ iu inu, t*t*\y~f. 


75 


50 


1043 


G02532 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6613. 


60 


53 


1044


M94582 


Homo sapiens 


interleukin 8 receptor B 


1850 


100 


1045




Homo sapiens 


i ^similar 10 ivjf/vl#o ^lnsujin ilivc 
growth factor binding protein, acid labile 
subunit)) 


1704 


50 


1046 


AF125101


Homo sapiens 


HSPC040 protein 


580 


100 


1047 


W74809 


Homo sapiens 


Human secreted protein encoded by gene 81
clone HMWDN32. 


176 


100 


1048 


AL022238 


Homo sapiens 


dJ1042K10.4 (novel protein)


2201 


100 


1049 


W88667 


Homo sapiens 


Secreted protein encoded by gene 134 clone
HA1BP89. 


1559 


99 


1050 


AF097518


Homo sapiens 


liver-specific transporter 


2820 


100 




1051


W /5Jzt 


Homo sapiens


Fragment of human secreted protein encoded by


1 J l o 


QS 








gene o j . 






1052


V1 1 Q< 1 
I Z 1 OJ 1 



Homo sapiens 


Human signal peptide-containing protein (SIGP)


1 AA1 










(finm. in lilt 1 1A\ 
\C\OV\C 1LJ ZJZOlJ'tJ. 






1053


ATI A.1 fl I * 



Arabidopsis 


putative protein 


OOl 


















I fOZUv/ 


Homo sapiens


n urn mi sccrcicu protein encuueu oy gene / /. 




i nn 


1 fYS^ 


rYJZ (OjO/ 


Homo sapiens 


TP 1 H-l ilf^ Rhrt (TTPnv 
Hjl U-l UtC IvDO vj 1 1 acrZ 


1 1 f£S 
1 1 Ou 


i nn 


1UDO 


I .£ /OiU 


Homo sapiens


nurnan seLrcLcu pruLcin encuueu uy gene nu. 


1 JH 




1UD / 


ni d^in 


Homo sapiens


riDosomai pruicio 


745 


i nn 




AT I JiUUVJ 


Homo sapiens 


xajjai protein 




inn 


i n«i 


AT ni 1 


Homo sapiens 


rHl i4 D ") 1 t /nrw^l D 71? D f'kvn'Fiuli'avanmiB 

□Jj4BZ1.i ^novci tjj6K-r ^Dcnzociiazapine 




l nn 

1UU 








receptor (penpnerai) ^MJUK, rBK, rtirvb, Jiir, 












Isoquinoline-bindtng protein)) LIK.H protein) 






1060 


AF227135



Homo sapiens 


candidate taste receptor T2R9


1 1A 


33 


1061


Y275/.> 


Homo sapiens 


Human secreted protein encoded by gene No. 9. 




100


1062 


Z11697


Homo sapiens 


HB15 


1088 


100 


1063 


AF123757


Homo sapiens 


putative transmembrane protein 


oiy 


100 


1064 


AF155135 


Homo sapiens 


novel retinal pigment epithelial cell protein 


2932 


99 


1065 


Y41674 


Homo sapiens 


Human channel-related molecule HCRM-2. 


936 


99 


1066 


AJ250042 


Homo sapiens 


Rab5 GDP/GTP exchange factor homologue 


2575 


100 


1067 


Y36087 


Homo sapiens 


Extended human secreted protein sequence, SEQ ID NO. 472.


770


85






1068 


Y94959 


Homo sapiens 


Human secreted protein clone mc300_1 protein sequence SEQ ID NO: 124.


301


100






1069 


Y94939


Homo sapiens


Human secreted protein clone mc300_1 protein sequence SEQ ID NO: 124.


301


100






1070 


W64535 



Homo sapiens 


Human leukocyte ceil clone rirOUS04 proteirt 


on i >t 
Z014 


nn 
99 


JO/J 


A03 J 43 


Homo sapiens 


/* , it>17 TIT 

pot. UXr ill 


1 AV 

I ft 


<n 

JU 


1072 


AL031177 


Homo sapiens 


dJ889MJ5.3 (novel protein) 


821 


91 


1073 


Xs22oU 


Homo sapiens 


gpStafSO 


24y 


62 


1074


G03213


Homo sapiens 


Human secreted protein, SEQ ID NO: 7294.


99


47


1075 


Y36233 


Homo sapiens 


Human secreted protein encoded by gene 10.


506 


55 


1076 


G03187 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7268. 


424 


98 


1077 


L25899


Homo sapiens 


ribosomal protein L10 


332 


76 


1078 


Y91447 


Homo sapiens 


Human secreted protein sequence encoded by 


898 


97 








gene 48 I>n.Q 11J NO. loo. 






1079


r~- r-i ] ojcT 


Homo sapiens 


Human secreted protein, Jsjfcv,! LU NO. jy4j. 


*7Qn 
Z90 


on 

89 


i nen 


A DftTOTil 

a±joj9 /^j 


Homo sapiens 


wni receptor iri22ueo- j 


1 


yz 


i not 
l US l 


Atmin^^"? 


Homo sapiens 


Na/PCM cotransporter homolog 




inn 


1 no 


L.13SUZ 


Homo sapiens 


ribosomal protein small subunit




fin 
OU 


1 1W1 

LVoj 




Homo sapiens 


Human secreted protein encoded by gene 42 


It J 


51 1 
a 1 








clone HaAtiizj. 










— — : 

Homo sapiens 


numaD sccrcico pruicui, dlv invj. /o*td. 


OJ 


< 1 

J I 


lOoD 




Homo sapiens 


tiuman secreicu proxcm, invj. omi. 


KS 
Oo 


tj 


i no/; 


Ar UVUV4x 


Homo sapiens 


.rKtJUtO / 


1 Z4 


04 


1 fUl"7 
1UB / 


OOUD 1 / 


Homo sapiens 


Human secretea protein, aii^ uj zno. *+jyo. 


1 


A 1 
41 


lUoo 


O04UV I 


Homo sapiens 


Human secreted protein, aty iu no. oJ /2. 


IZO 


3D 


lUSV 




Homo sapiens 


G-protein coupled receptor 14 






i non 


O04UO3 


Homo sapiens 


Human secreted protein, s^bi^ jjj no. oJ14. 


1 1 A 




iuy 1 


o *70 1 n,* 
a f Z3U4 


Mus sp. 


LM w (j-protcin 


140 


HJ 




W8570o 


Homo sapiens 


Secreted protein encoded by gene 175 clone 


Arts 
4Uj 


i nn 








rtcrviftjvi £ t i . 






ivy 3 


W85612



Homo sapiens 


becreted protein clone aiiZ5_p. 


1358 


Vf 




I JjUl £. 


Homo sapiens 


Human secreted protein clone pm514_4 protein


mil 


OQ 

77 








sequence SEQ ID NO:30. 






1095 


Y92345 


Homo sapiens 


Human cancer associated antigen precursor from clone NY-REN-62.


409


100






1096 


AF090942 


Homo sapiens 


PRO0657 


147 


60 


1097 


L24521 


Homo sapiens 


transformation-related protein 


166 


58 


1098 


X56932


Homo sapiens 


23 kD highly basic protein 


490 


70 


1099 


G04063 


Homo sapiens 


Human secreted protein, SEQ ID NO: 8144. 


83 


35 


1100 


Y02693 


Homo sapiens 


Human secreted protein encoded by gene 44 clone HTDAD22.


149


59
1101


AF119851


Homo sapiens 


PRO1722


183 


72 


1102 


G04086 


Homo sapiens 


Human secreted protein, SEQ ID NO: 8167. 


207 


62 


1103


G04063 


Homo sapiens 


Human secreted protein, SEQ ID NO: 8144. 


91 


52 


1104 


X74856 


Mus musculus


ribosomal protein L28


128


69








1105


G03789 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7870. 


130 


62 


1106


G03133 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7214. 


122 


48 


1107 


G03040


Homo sapiens 


Human secreted protein, SEQ ID NO: 7121. 


69 


43 


1108 


AF039942


Homo sapiens 


HCF-binding transcription factor Zhangfei


744 


99 


1109


AF201951 


Homo sapiens 


high affinity immunoglobulin epsilon receptor beta subunit


738


94






1110


AFl 1 1 1 OS 

*0 AAA 4 W 




transient receptor potential 2 


223 


79 






musculus 








1111 


AF119900


Homo sapiens 


PRO2822


144 


59 


1112 


Y16589


Homo sapiens 


A protein that interacts with presenilins. 


265 


39 


1113


G02872 


Homo sapiens


Human secreted protein, SEQ ID NO: 6953. 


178 


67 


1114


Y02999 


Homo sapiens


Fragment of human secreted protein encoded by gene 121.


164


63






1115


Y30811 


Homo sapiens


Human secreted protein encoded from gene 1.


1217 


99 


1116


X51394 


Xenopus laevis


APEG precursor protein


130


40








1117


M27826 


Homo sapiens 


neutral protease large subunit 


442 


65 


1118 


G03371 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7452. 


72 


60 


1119 


G03602 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7683. 


491 


97 


1120


Y35906 


IJLPlIlv dO^Jlblid 


Pvt^fiH^rl human ^^CTpfpiA nrntein c^^/iii^nf^^ ^Fvf"l 

UAL^llUbU 1 J till lOi 1 jCvl ^LVU JV^UVllvV, 


244 


97 








ID NO. 155.






1121


G03714 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7795. 


122 


65 


1122


Y00337


Homo sapiens


I-Tumsin ^^/*r^t^ri rtrot^in cnorwH/v) hv &^n^ Si 


no 


90 


1123


AF0S4S10 




fu/fi r\r\rf* remain rh^nn^l* *TA^T£»2 


701 


Q4 

v*-t 


1124






m pmhranp intf*rr»rti rxu nrot^i n nf R frS I 

II Ifcrl 1 1 Ul Ol Iv-i 111 l.\~X CILI 1 1» K Ul w L&lll Ul <--> A U 


442 


88 


1125




Homo sapiens


I4iirndn Q.t*/*tw*t&f\ nr^tf^m r^lnn^ |^\1/^Q^ j 

n u in an sccrcica pruicin irom tp-junc izfj jl. 


1Q1 


j j 


1126


L(UI Jul 


Homo sapiens 


ri urn ail secret ca p rote in, atzx^ ilf nw. js-^f*. 


1 sd. 


inn 


1127


UUlJDI 


Homo sapiens 


U ttman C A rn >tf>rl nrr*t*»m QFO TPl "KT/'V 

rruinaii iccrcicu prutcin, jcy llj in\j . 


i fi*; 

1 DJ 


inn 


1128




Homo sapiens 


Human cardiovascular system associated protein


JIT ^ 


QO 








n A n a 1 






1129




Homo sapiens


Uiimnri vt*rrftrt^ nrntrin ^TPO TP> >JO- /^IRA 




/ j 






Homo sapiens 


Transmembrane domain containing protein clone


inn 


inn 








HP01512. 






1131


Y29817


Homo sapiens


Human cvn ?i n*;/* rf*lf»tpH olv^rmtnt^in 0 


260 


91 


1132






Human secreted protein sequence encoded by


J x\*> 


Q6 








gene 43 SEQ ID NO:317.






1133


Y91449


Homo sapiens


Human secreted protein sequence encoded by


542 


100 








gene 49 SEQ ID NO:170.






1134


AB017908


X MWJXXIV SAL'l^J 1 J 




2399 


93 


1135 


X51760 


Homo sapiens


/inc finder nrotrin f^RI AA^ 


312 


55 


1136


Y99426


Homo sapiens


Human PRO1604 (UNQ785) amino acid


917 


72 








sequence SEQ ID NO:308. 






1137 


G03790 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7871. 


102 


50 


1138


AF155106


Homo sapiens


NY-REN-36 antigen


768 


91 


1139


AL031055 




rtI2RW?0 1 fnnvH nrntein cimilnrtn mrmhrnne 

uj&di xt*\f. i ijiuvu iiiuu^iii siiiuiai llj 11 |L^l 1 1 1J1 (11 I\* 


117 


50 








transport proteins) 






1140 


AF011359 


Bos taurus 


regulator of G-protein signaling 7 


138 


96 


1141 


Y70018 


Homo sapiens 


Human Protease and associated protein- 12 


623 


100 








(PPRG-12). 






1142 


G04091 


Homo sapiens 


Human secreted protein, SEQ ID NO: 8172.


113 


38 


1143 


AB030235 


Canis 


D4 dopamine receptor 


89 


48 






familiaris 








1144 


Y94922 


Homo sapiens 


Human secreted protein clone pv6_I protein 


539 


88 








sequence SEQ ID NO: 50. 






1145 


X99962 


Homo sapiens 


rab-related GTP-binding protein


398 


96 


1146 


G03807 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7888. 


168 


79 


1147 


G03712 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7793. 


512 


85 


1148 


Y28279 


Homo sapiens 


Human G-protein coupled receptor GRJR-1. 


705 


76 


1149 


U13642 


Caenorhabditi 


exon 5 similar to transmembrane domain of S. 


247 


36 









s elegans 


cerevisiae zinc resistance protein 






1150


G03438


Homo sapiens


Human secreted protein, SEQ ID NO: 7519.


117 


62 


1151 


G01003


Homo sapiens 


Human secreted protein, SEQ ID NO: 5084. 


181


80 


1152 


G03798 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7879. 


198 


63 


1153 


X88799


Oryza sativa


DNA binding protein 


95 


41 


1154 


D85245 


Homo sapiens 


TR3beta 


155 


96 


1155


R74272 


Homo sapiens 


Tumour suppressor protein, p53. 


341 


87 


1156


Y86265 


Homo sapiens 


Human secreted protein HUSXB77, SEQ ID 
NO: 180. 


99 


41 


1157


G02577 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6658. 


263 


98 


1158


AF104334


Homo sapiens 


putative organic anion transporter 


185 


42 


1159


G01393


Homo sapiens 


Human secreted protein, SEQ ID NO: 5474. 


173 


57 


1160 


W75771 


Homo sapiens 


Human GTP binding protein APD08. 


224 


81 


1161




JTJLVJ1I1U dfl^lwlli} 


M-AftP2 nmtein 

1 ▼ 1 4»J_> U J W W 111 


410 


83 


1162 


W67816 


Homo sapiens 


Human secreted protein encoded by gene 10 


1156 


100 


1163


r\x J J70J 1 


Homo sapiens


PRO1779


230 


70 


1164


Y a 1 ZJd. 


Homo sapiens


U"i irn rin clonal npnhHik ^rtrttflitl itltr r\t"flt/> i tl tJQPP_ 
rxU.1 HcLLl dlKllcU IJt^LfLJuC iwkJi lutlll 11 lg LJluidll nor r 

29 SEO ID NO -29 


113


31 


1165


W64537 


Homo sapiens


Humnn ltw*r r*<"l I flnnp T-TPftl ) A$t nmfrin 

J. \ax J J rl 1 1 11 rbl Vvll ill V l 1. »\/ ui VLvlll- 


338 


82 


1166






HC6 


134 


64 


1167 


Y14482


Homo sapiens 


Fragment of human secreted protein encoded by 
gene 17* 


119 


51 


1168




coli 


DppC. 


411 


90 


1169


R63783 


I JLk^r 1 J 1 1' aALS 1 ^ L J ■> 


TG0847 protein, 


344 


90 


1170


Yd'5'274 




Pnmnn ^^fn^t^H nrrtf^in pncriHiVl frftm fffttip 1 5t 


478 


98 


1171


UU4 i ~> » 


Homo sapiens


Mr 1 1 f) OOO antipeji 


347 


96 


1172






organic anion transporter OATP-B


311


67 


1173 


G00357 


Homo sapiens 


Human secreted protein, SEQ ID NO: 4438. 


60 


52 


1174


r\B77 1 7 
L/o /ill 


Homo sapiens


flimilar tt~\ human € 4 1 PBQ^aarh vatino 

nrote f A49ft69^ 

Lrivipwiiiiinyuuy j 


178 


59 


1175


LV1 U*T 1 1 V 




riHrwr»ma1 rtrntf*in 


391 


78 


1176 


R08330 


Homo sapiens 


Human IL-7 receptor clone H6.


285 


67 


1177




Homo sapiens


riuvjauj lieu protein iiii 


242 


72 


1178




Homo sapiens


riroftnip r*ntirfcn trnn*;nnrtr*r fnf^T"?^ 


276 


88 


1179


G03258


Homo sapiens


Human secreted protein, SEQ ID NO: 7339.


155 


71 


1180 


G01207


Homo sapiens 


Human secreted protein, SEQ ID NO: 5288. 


282 


90 


1181


r\j I 0 J OJO 


i\a tlU 3 




249 


62 


1182




Homo sapiens


HSPC176


138


90 


1183


G03789


Homo sapiens


Human secreted protein, SEQ ID NO: 7870.


282 


66 


1184 


Y02671 


Homo sapiens 


Human secreted protein encoded by gene 22 
clone HMSJW18


107 


71 


1185


G03797


Homo sapiens


Human secreted protein, SEQ ID NO: 7878. 


58 


69 


1186


G03564


Homo sapiens


Human secreted protein, SEQ ID NO: 7645.


118


46 


1187


AB032905


Hylobates

concolor


dopamine receptor D4


96 


37 


1188


G00956 


Homo sapiens 


Human secreted protein, SEQ ID NO: 5037. 


292 


78 


1189


G03258 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7339. 


178 


79 


1190 


G03361 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7442. 


324 


76 


1191


AF117755


Homo sapiens 


thyroid hormone receptor-associated protein 
complex component TRAP230 


187 


70 


1192 


Y70455 


Homo sapiens 


Human membrane channel protein-5 (MECHP- 
5). 


202 


67 


1193 


G03052


Homo sapiens 


Human secreted protein, SEQ ID NO: 7133. 


99 


42 


1194 


G02607 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6688. 


192 


76 


1195 


W29661 


Homo sapiens 


Homo sapiens CI542 2 clone secreted protein. 


2001 


98 


1196 


Y14104 


Homo sapiens 


Human GABAB receptor 1d protein sequence.


239 


69 


1197 


X61972 


Homo sapiens 


macropain subunit iota


149 


90 


1198 


G00534


Homo sapiens 


Human secreted protein, SEQ ID NO: 4615. 


145 


51 


1199 


Y86260 


Homo sapiens 


Human secreted protein HELHN47, SEQ ID 
NO: 175. 


1089 


89 


1200 


G02607 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6688. 


154 


57


1201 


G00838 


Homo sapiens 


Human secreted protein, SEQ ID NO: 4919. 


404 


50 


1202 


M27826 


Homo sapiens


neutral protease large subunit 


202 


49 


1203 


Y73424 


Homo sapiens 


Human secreted protein clone yi4_l protein 
sequence SEQ ID NO:70.


265 


61 


1204 


AF264014 


Homo sapiens 


scavenger receptor cysteine-rich type 1 protein 


625 


98 


1205 


Y36203 


Homo sapiens


Human secreted protein #75. 


219 


59 


1206 


U78111 


Online online 


AO 


205 


57 


1207 


AF095448 


Homo sapiens 


putative G protein-coupled receptor


416 


76 


1208 


AF116715 


Homo sapiens 


PRO2829


127 


75 




AT UIO* 13/ 


Homo sapiens 


MaxiK channel beta 2 subunit 


47<i 


7J 


1210




Homo sapiens


H **r^^af r\r*j* Hitler r*a tv i Am i T*f* 1 ?i tf* H mttntiv^ tiimnr 


423 


79 


1211




Homo sapiens


Human secreted protein encoded by gene
107. 


224 


70 


1212 


G00719 


Homo sapiens 


Human secreted protein, SEQ ID NO: 4800. 


117 


44 






Homo sapiens


T~tnman c/*^r^tAH rtmt^m ^Ffl TO MO- 5HQO 


351


73 


1214 


AF090942 


Homo sapiens 


PRO0657 


124 


70 




Y 1 44Z / 


Homo sapiens 


Human secreted protein encoded by gene 17
clone HSIEA14. 


oo 


77 


i "> i *; 




Homo sapiens 


iriuman sccreieu proiein, ocv^ id ihvj , /7<so. 


1 IX 

1 1 J 




1217


Y5 /8y / 


Homo sapiens 


T T. -.in * -lit.-. i.m 1^ TL IT. _n .-. j- j-.t.ai 11 f V J "ft jf L> f\J Of 

Human transmembrane protein Hi MrTJ-zi. 


1 1 T1 


1 UU 


1218


rnA l cm 


Homo sapiens 


HLA-DR antigen alpha chain




7ft 
to 


1219


YDyvuy 


Homo sapiens 


oecretec protein /o-jio-j-A 1 z-ri> i . 


A1(\ 


Q7 


1220 


W815 /o 


Homo sapiens 


EBV-induced G-protein coupled receptor (EBI-
2) polypeptide.




ILMJ 


1221 


Wy6745 


Homo sapiens 


High affinity immunoglobulin E receptor-like
protein (iiJtKJJj. 


ODU 


QO 

yo 


1222 


Y35911


Homo sapiens


Extended human secreted protein sequence, SEQ
ID NO. 160.


135 


31 


1223


VIAAT70 


Homo sapiens 


Human secreted protein encoded by gene 21.


ZOu 






Ar 1 o 1 4ZZ 


Homo sapiens 




JOB 




1225 


U 14970 


Homo sapiens 


ribosomal protein S5 


202 


95 


1226


G01733


Homo sapiens


Human secreted protein, SEQ ID NO: 5814.


a i n 

O 1 u 


i rvA 


1227 


AF099973


Mus 

musculus 


schlafen2 


333 


56 


1228 


G01218 


Homo sapiens 


Human secreted protein, SEQ ID NO: 5299. 


155 


81 


1229 


AF217188


Mus

musculus


YIP1B


on i 


OJ 


1230


*C1 Oil 

At 1 /OBI 3 


Homo sapiens 


soluble adenylyl cyclase




i nn 


1231


A70JJJ 


Homo sapiens 


organic cation transporter 


t 7fW. 
1 / U*V 




1232 


W74955 


Homo sapiens 


Human secreted protein encoded by gene 77 
clone riL'tAaA*. 


212 


53 


1233


I if**5r«HJ 


Homo sapiens


Human secreted protein clone yi62 1 protein 

sequence SEQ ID NO:86. 




inn 


1234


T 17AA1 R 
u /oo 1 O 


Mus

musculus


XT U A P 


482 


82 


1235 


AF044924 






380 


97 


1236 


G01459


Homo sapiens 


Human secreted protein, SEQ ID NO: 5540. 


417 


100 




AFfVWll SI 






164 


84 


1238


W ooOjj 


Homo sapiens


occreica pruiciii encuueu uy gene iuu Liunc 
HE8EU04. 


^ju 


on 


1239


W90r;/;n 

VY £7UUU 


Homo sapiens


noiuo sapiens ^rix / j ciunc sevrcicu pnjiein. 


697 


OR 


1240 


A\ I l./V^t 1 u t 


Oryctolagus

cuniculus


LJCI UA1MJ1L JOi l^a^iiCUCIILiCIIL A4JILILC UUIIu 


154 


52 


1241 


Y92710 


Homo sapiens 


Human membrane-associated protein 2sig24. 


709 


97 


1242 


Y95002 


Homo sapiens 


Human secreted protein vc34_1, SEQ ID NO: 44.


908 


88 


1243 


Y44905 


Homo sapiens 


Human potassium channel molecule ERG-LP2 
partial protein. 


325 


100 


1244 


AF284422 


Homo sapiens 


cation-chloride cotransporter-interacting protein


511 


97 


1245 


Y53629 


Homo sapiens 


A bone marrow secreted protein designated 
BMS115. 


1888 


93 


1246 


AB039371 


Homo sapiens 


mitochondrial ABC transporter 3 


389 


97 


1247 


Y35911 


Homo sapiens 


Extended human secreted protein sequence, SEQ 


168 


39

ID NO. 160. 






1248 


AF072509


Rattus 


glutamate receptor interacting protein 2 


559 


90 






norvegicus 








1249 


AF247042 


Homo sapiens 


tandem pore domain potassium channel TRAAK 


661 


98 


1250 


B08974 


Homo sapiens 


Human secreted protein sequence encoded by 


1087 


97 








gene 27 SEQ ID NO:131.






1251 


L15313 


Caenorhabditi 


putative 


858 


59 






s elegans








1252 


Y29338 


Homo sapiens 


Human secreted protein clone it217_2 alternate 


278 


75 








reading frame protein. 






1253 


W01730


Homo sapiens 


Human G-protein receptor HPRAJ70. 


211 


92 


1254 


G03074 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7155. 


294 


83 


1255 


G01818


Homo sapiens 


Human secreted protein, SEQ ID NO: 5899. 


253 


91 


1256 


AF286368


Homo sapiens 


eppin-1 


222 


54 


1257 


AF220264 


Homo sapiens 


MOST-1 


87 


93 


1258 


G02227 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6308. 


281 


78 


1259 


Y07970 


Homo sapiens 


Human secreted protein fragment #2 encoded 


8J 


94 








from gene 26. 






1260 


R95332 


Homo sapiens 


Tumor necrosis factor receptor I death domain 


986 


100 








ligand (clone 3TW).






1261 


AF140674


Homo sapiens 


zinc metalloprotease ADAMTS6 


172 


36 


1262 


U28369 


Homo sapiens 


semaphorin V 


237 


67 


1263 


Y07049 


Homo sapiens 


Renal cancer associated antigen precursor 


288 


71 








sequence. 






1264 


Y36153 


Homo sapiens 


Human secreted protein #25. 


187 


80 


1265 


Y78114 


Homo sapiens 


Human cytokine signal regulator CKSR-2 SEQ 


723 


93 








ID NO:2.






1266 


Y13397


Homo sapiens 


Amino acid sequence of protein PRO334.


191 


100 


1267 


AF030558 


Rattus 


phosphatidylinositol 5-phosphate 4-kinase 


859 


95 






norvegicus 


gamma 






1268 


U73167 


Homo sapiens 


candidate tumor suppressor gene LUCA-1 


159 


96 


1269 


AF190664


Mus 


LMBR2 


552 


76 






musculus 








1270 


AL050332 


Homo sapiens 


dJ570F3.1 (homolog of the rat synaptic ras


820 


98 








GTPase-activating protein p135 SynGAP)






1271 


G02126 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6207. 


131 


95 


1272 


AF125533 


Homo sapiens 


NADH-cytochrome b5 reductase isoform 


253 


92 


1273 


AL035661 


Homo sapiens 


dJ568C11.3 (novel AMP-binding enzyme


1280 


100 








similar to acetyl-coenzyme A synthetase












(acetate-coA ligase)) 






1274 


AF064748 


Mus 


S3-12 


3523 


61 






musculus 








1275 


D17554 


Homo sapiens 


TAXREB107 


377 


78 


1276 


Y30715 


Homo sapiens 


Amino acid sequence of a human secreted 


643 


90 








protein. 






1277 


AF146760


Homo sapiens 


septin 2-like cell division control protein


707 


100 


1278 


Y05069 


Homo sapiens 


Human PIGR-2 protein sequence. 


281 


46 


1279 


X59668 


Oryctolagus 


aorta CNG channel (rACNG) 


267 


85 






cuniculus








1280 


G01051


Homo sapiens 


Human secreted protein, SEQ ID NO: 5132. 


489 


98 


1281 


G03411


Homo sapiens 


Human secreted protein, SEQ ID NO: 7492. 


120 


43 


1282 


AF055084 


Homo sapiens 


very large G-protein coupled receptor-1 


1635 


100 


1283 


AF117814 


Mus 


odd-skipped related 1 protein 


357 


98 






musculus 








1284 


U87318 


Xenopus 


NaDC-2 


535 


60 






laevis 








1285 


AF061346 


Mus 


Edpl protein 


452 


68 






musculus 








1286 


AB030182 


Mus 


contains transmembrane (TM) region 


582 


68 






musculus 








1287 


A13595 


synthetic 


immunosuppressive protein PP15


185 


97 






construct 








1288


AF254411 


Homo sapiens 


ser/arg-rich pre-mRNA splicing factor SR-AI 


837 


100 


1289 


AF084205


Rattus


serine/threonine protein kinase TAO1


319 


98 






norvegicus


1290 


AF038563 


Homo sapiens 


membrane associated guanylate kinase 2 


523 


100 


1291 


Ar 03484 / 


Homo sapiens 


double-stranded RNA specific adenosine
deaminase 




i nn 


1292 


KJ 1 COOO 

Ml jooo 


Bos taurus 


endozepine-related protein precursor


7J / 


0 / 


1293 


AB010692


Arabidopsis
thaliana


ATP-dependent RNA helicase-like protein


DJO 


45 


1294 


AP209923 


Homo sapiens 


orphan G-protein coupled receptor 


1570 


100 


1295 


Wo/828 


Homo sapiens 


Human secreted protein encoded by gene 22 
clone HFEAJ41. 


5U4 


no 

90 


1296 


ACU04832 


Homo sapiens 


similar to 45 kDa secretory protein ; similar to 

f A A 1 f\4LA A 1 /Ttm-rwA 1 CA A 1 B"\ 

L.AAlUo44.1 (JrlD.g4Lo441o,J 


(LA Q 


£.K 

t)5 


1297 


AOUU35 


Oryctolagus 
cuniculus


cysteine rich hair keratin associated protein 


5/5 


/U 


1298


G02645


Homo sapiens


Human secreted protein, SEQ ID NO: 6726.


XX J 


Q7 


1299




Homo sapiens 


Human delta3 fragment #4. 


1ZZ 




1300 


W70504 


Homo sapiens 


Leukocyte seven times membrane-penetrating 
type receptor protein Jb(j 1 0. 


459 


81 


1301


T07313


Homo sapiens


Human secreted protein BL89_13 amino acid
sequence. 


jy lo 




1302


M//o9i 


Homo sapiens


spermidine/spermine N1-acetyltransferase


1 t 1 * 


yo 


1303 


G01331


Homo sapiens


Human secreted protein, SEQ ID NO: 5412.


254 


69 


1304


G01491


Homo sapiens


Human secreted protein, SEQ ID NO: 5572.


/4/ 




1305 


AF148509


Homo sapiens


alpha 1,2-mannosidase


602 


98 


1306 


G01658 


Homo sapiens 


Human secreted protein, SEQ ID NO: 5739. 


333 


98 


1307 


Y90899 


Homo sapiens 


D1-like dopamine receptor activity modifying
protein SEQ ID NO:1.


332 


98 


1308 


AF033120


Homo sapiens 


p53 regulated PA26-T2 nuclear protein 


348 


52 


1309 


Y73388 


Homo sapiens 


HTRM clone 3376404 protein sequence. 


147 


66 


1310 


AF063243


Bos taurus 


ribosomal protein L30 


296 


90 


1311 


AF224494 


Mus 

musculus 


arsenite inducible RNA associated protein 


688 


70 


1312 


Y73342 


Homo sapiens 


HTRM clone 2709055 protein sequence. 


1154 


100 


1313 


Y99419 


Homo sapiens 


Human PRO1780 (UNQ842) amino acid 
sequence SEQ ID NO:282. 


1145 


78 


1314 


AF116667


Homo sapiens


PRO1777


433 


97 


1315 


W75100 


Homo sapiens 


Human secreted protein encoded by gene 44 
clone HE8CJ26.


807 


97 


1316 


AJ272078 


Homo sapiens 


APOBEC-1 stimulating protein 


789 


100 


1317 


AB041533 


Homo sapiens 


sperm antigen 


2607 


98 


1318 


U19617 


Mus 

musculus 


Elf-1 


806 


92 


1319 


U82598 


Escherichia 
coli 


ferric enterobactin transport protein 


768 


100 


1320 


D90892 


Escherichia 
coli 


SORBITOL-6-PHOSPHATE 2-
DEHYDROGENASE (EC 1.1.1.140)
(GLUCITOL-6-PHOSPHATE
DEHYDROGENASE) (KETOSEPHOSPHATE
REDUCTASE).


709 


100 


1321 


Wo/84/ 


Homo sapiens 


Human secreted protein encoded by gene 4 1 

Clone tirrsK^i /4. 


cm 
oU 1 


<Yi 


1 

1.7 


JKJi /OIUI 


Homo sapiens


vjrivVjjD protein 




7J 




ftJJ. /OlUi 


Homo sapiens


uriv^jD protein 




Q7 


1324 


Y58628 


Homo sapiens 


Protein regulating gene expression PRGE-21. 


1584 


100 






Rattus

norvegicus 


pyridoxine 5'-phosphate oxidase


1 ITT 


fiy 








vsr\±jri-\*y mviiiruiiic uj lcuu^iuc lauiuiiii 


1606 


i no 


1327 


Y32206 


Homo sapiens 


Human receptor molecule (REC) encoded by 
Incyte clone 2825826. 


1531 


90 


1328 


AF151048


Homo sapiens 


HSPC214 


657 


85 


1329 


Y10530


Homo sapiens 


olfactory receptor 


1645 


100 


1330 


AF180681


Homo sapiens 


guanine nucleotide exchange factor 


4314 


99 


1331 


AF111856


Homo sapiens


sodium dependent phosphate transporter isoform
NaPi-3b 


3591 


99 


1332 


Y13583 


Homo sapiens 


G-protein coupled receptor 


2171 


100 


1333 


AF078866 


Homo sapiens 


SURF-4 


1395 


100


1334 


Y25755 


Homo sapiens 


Human secreted protein encoded from gene 45.


1380 


96 


1335 


AF152325


Homo sapiens 


protocadherin gamma A5 


4742 


99 


1336


X74070




transcription factor BTF3


639 


81 


1337 


AF095927 


Rattus 


protein phosphatase 2C 


1931 


95 




G03877


Homo sapiens


Human secreted protein, SEQ ID NO: 7958.


621 


100 


1339 


AL008582 


Homo sapiens 


bK223H9.2 (ortholog of A. thaliana F23F1 .8) 


626 


100 


1340


AO IDI J 


Homo sapiens


leukemia inhibitory factor receptor


5820 


99 


1341




Homo sapiens 


A carcinogenesis-inhibiting protein.


7S9R 


07 


1342




Homo sapiens 


f*maTinliiTriin# L-irlQc^ 
SUlaliOlalllUlC KUla&C 


&j f 


100 


1343




Rattus 
norvegicus 


GTP-binding protein


1 1 an 

* 


Q7 


1344 




Arabidopsis

thaliana


putative phosphoribosylformylglycinamidine

synthase; 25509-29950


3283 


51 


1345 


Y28576 


Homo sapiens 


Secreted peptide clone pe503_l. 


944 


100 


1346 


W747S7 


Homo sapiens 


Human secreted protein encoded by gene 58 
clone HHFHN61. 


1171


100 


1347 


M55542 


Homo sapiens 


guanylate binding protein isoform I 


2636 


87 


1348 


AF183428


Homo sapiens 


28.4 kDa protein 


1329 


100 


1349 


U70669 


Homo sapiens 


Fas-ligand associated factor 3


167 


24 


1350 


AF295530 


Homo sapiens 


cardiac voltage gated potassium channel 
modulatory subunit 


562 


99 
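
The table above reports a Smith-Waterman score and a percent identity for each listed homolog. As an illustration only, the following minimal sketch computes both values for a pair of sequences. It assumes a simple match/mismatch scheme with a linear gap penalty and defines percent identity over the columns of the best local alignment; the substitution matrix, gap penalties and identity definition actually used to generate the tabulated values are not stated in this section, so the function name `smith_waterman` and all parameter values are hypothetical.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_local_score, percent_identity) for sequences a and b.

    Scoring values are illustrative assumptions, not those behind the table.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the dynamic-programming matrix; local alignment floors cells at 0.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until a zero cell is reached.
    i, j = best_pos
    identical = aligned = 0
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            identical += int(a[i - 1] == b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
        aligned += 1

    percent_identity = 100.0 * identical / aligned if aligned else 0.0
    return best, percent_identity


if __name__ == "__main__":
    print(smith_waterman("XXACDEYY", "ZZACDEWW"))
```

Under these assumed parameters, a higher score reflects a longer or better-conserved local match, while percent identity depends only on the aligned region, which is why short hits in the table can show high identity with low scores.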



TABLE 3 



SEQ ID NO: of nucleotide sequence


SEQ ID NO: of peptide sequence


Method


SEQ ID NO: in USSN 09/496,914


Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence


Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence


Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion)


1 


1351 


A 


2 


337 


1 


TPSLIHQAPTF*CPAGLWG/PPNGHYHGS*PGC 
HWPQAPHRA* * * G LLPPR WLGHGLPGGP AAP 
WAASQWVDGVAGRLPGPAWSWHASGAAPA 
QPGPL*LLVPGSSGLPDPRDP 


2 


1352 


A 


27 


100 


366 


IRNSSIRPMKERETK1SAKHMITCSASYDIRGL 
QIETTAYHHTPiRMAKIQKT/GHHQC**ECGAT 
GTLIHG WWG CK WEPLGKTV WQIPK. 


3 


1353 


A 


40 


3 


314 


• HA S AHA S V V LKDN S ELE QQLG ATG A YRARA 
LELEAEVAEMRQMLQLEHPFVNGADKLRPD 
SMYVHLNEL*QSLVENMLLTVVDTH\RTPI*R 
SCNYTLALILFL 


4 


1354 


A 


74 


2 


292 


TASALFSCPDGGSLAGFAGRRASFHLECLKR 
QKDRGGDI SQKTVLPLHLVHHQV AHTFGQ AT 
VTCQQARQSPG*RTNPE/ALQWVLPVSDGWH 
VLPLP 


5 


1355 


A 


78 


114 


850 


ENCRVASNLPGVFFSEDTAQSGSYMRISAHPP 
N AGGE VSNGPKRKLTLMLNFSLPS SGLNAG A 
FYALSTLLNRMV1WHYPGEEVNAGRIGLTIVI 
AGMLGAVISGIWLDRSKTYKETTLVVYIMDT 
GGA WWCYTFYLGTGDTCG* CFTTAGNTMGFF 
MTGYLPLGFEFAVEIASYPESEGISSGLLNISA 
QVFGIIFTISQGQIIDNYGTKPGNTFLCVFLTLG 
AALTAF1KADLRRQKANKETLEN 


6 


1356 


A 


81 


97 


376 


EWFSYMLGSNMSVYHSP*SLEPLCKVLSES*A 
YLRVPFIRJLLNAR* JRKA YKRMSLEIKLLI/RE 
♦CLFQEMGLSLQWLYSARGDFFRATSRL 


7 


1357 


A 


93 


2 


872 


TLSSACLJGDAWKELTTVAGAVSNQLLVWYP 
ATALADNKPVAPDRRISGHVGIIFSMSYLESK 
GLLATASEDRSVRIWKGGDLRVPGGRVQN1G 
HCFGHSARVWQVKLLENYLISAGEDCVCLV














ii/ci_rpiT;Pir da FBfiHnriB aiTt a i a ahfrhawv 

V\ o ijLX^AJC 1 L^V^/^i ivVJ ri V^/V_J ivvj 1 iv.rVl il JC-rvv^ /\ V* V 

ITGGDDSG1RLWHLVGRGYRGLG/DLGSLLQ 
VP**ARYTOGrDSGWLL ATAGSD*YRGPV <: 5I 
* RRGQ VLGAAARG* TFPVLLPAGGSS WSRGL 
RIVCYGQWGRSCQGCPHQHSNCCCGPDPVS 
WEGAQLELGPAWL 


8 


1358 


A 


106 


3 


350 


FSSLLSGRISTLRDETGAILIDGDPAACAPIIKF 
LLTEELHLRG V S 1Y VLRHEAQIYGITPLAVC AL 
LI/CRRL*SDSCMRAALNDRGLYQVLILDGLV 
QCLGFVDSDSRKMVSTLT 


9 


1359 


A 


115 


49 


186 


QAWA1FKGKYKEGDTGGPAVWKTRT.RCALN 
KSSEFNEGPERERMDV 


10 


1360 


A 


123 


2 


1249 


KGCRTQEKVDRTEVIRTCINPVYSKLFTVDFY 

FEEVQRLRFEVHDISSNHNGLKEADFLGGME 

CTLGQiVSQRKLSK SLLKHGNTAGK S SITVTA 

EELSGNDDYVELAFNARKLDDKDFFSKSDPF 

LErFRMNDDATQQLVHRTEVVMNNLSPAWK 

SFKVSVNSLCSGDPDRRLKCrVWDWDSNGK 

HDFIGEFTSTFKEMRGAMEGKQVQWECINPK 

YKAKKKNYKNSGTVILhTLCKIHKMHSFLDYI 

MGGCQlQFTVAIDFTASNGDPRNSCSLHYTHP 

YQPNEYLKALVAVGEICQDYDSDKMFPAFGF 

GARJPPEYTDSHDFAJNFNEDNFECAGIQGW 

EAYQ SC FVPKAPTFTGPTN ICPHSSRKV AKFRR 

SEGN*HQGRAFAIIFILVDPGQVGVYSQDMGP 

DNPGGHFV 


11 


1361 


A 


147 


614 


9 


ACARKQLLGRTVFIWFVGQLLGGELKGYSKT 

NTTSSRPASSRG\TLSSSSSSSSSLTKDALPSSL 

KSDS 1 llTSGL VFFFRSLCVNPAKSS V SESVSSI 

KILLSSSVKYLE*KRTSCCFPDSSESKLSQLSS 

DERVSMGTSSRKPTNSSSSLGALKMSATS\*G 

SGSESPTPFFLTGLQSPPSTRPREPGLTTARNS 

TTLTRDC 


12 


1362 


A 


177 


12 


416 


LIPSEPALDSLVDPRVTlSRXQPFVlYPVYDTAI 
DTKIHFSLLDGNVGEPDMSAGFCPNHKAAM 
VLFLDRVYGIEVQDFLLHLLEGGFLPDLRAA 
ASLDT/AEIGAMDFLLS* LFTLCLMMFFFIYPFI 

VTI T TH A~KT\7~\S 

inLL i jvIjn V i 


13 


1363 


A 


249 


535 


105 


WTFHRHL SP APLIVCDQGTC V V S Y YPQNI VQ 
MPDTQMEQGLN/HLFLDGNATHSVECYCPS 
TFELAJKITSFVLYFHRYRAPEVLLRSSVYSSPI 

r»VWAV(^QT\jlAPT VVjTF DPI PPnT^FVmPTFlfir' 
\J V VY /\ V IJOHYl/vtlj I rVUL-Ivr L-r r VJ 1 Sty L/dLT rvlV^> 

QVLGTPKKVSTLVPKLL 


14 


1364 


A 


254 


572 


201 


YLLTXIGNLMMLLVINADSCLRTXM+FFLGH 
FFFLDICYSSVTAQDAAEFPVS*KPILVWGYrT 

TMVMNRPLCTATVNATNKMGFLNSQVN 


15 


1365 


A 


JLj 1 




Do 


THAVCl WVFMIPVT VfT PVT UVtWAlPTVM 

ArEFLLECDQNIT\KLICENT*KNIAKNI*KRRV 
TFTPIET* HP VKQMIK WQ» LTA WLRNRGYKKI 
KQTPNSETAPSVCRNLVFDKCG 


16 


1366 


A 


263 


104 


481 


FCEFRTTEEDRGGDDC W SVWTKQRNNSCV K. 
SKDWSKPVNIFWALEESVLGVKARQPKPFFA 
AGNTFEMTCKVSSKNIKSPRYSVLIMAEKPV 
GDLS SPNETKY1I SLDQDS VVKLENWTD ASRV 


17 


1367 


A 


298 


68 


208 


RKRTNNP1KLDKJCFEHFKNEDI * ITSKHTKM W 
VSSLAMKEMLTKTTM 


18 


1368 


A 


300 


904 


1 


LWGITGTRHHARVinFLVETGFPHVGQAGL 
ELLTSGDPPALASQSAGITGMSHCARPKGHFG














IHLK* MFYTMSQKMP* PTINLILLLIIPGNLNrF 

KPNMGWLGPKTAFV*KDEVLSG1PFAKGRCR 

WK*DY*C/LQEVTDP1MEKGKKKKRTASFFK 

GQPHQSTNALLRRCVR*RYHLS\TVETAGLP* 

KNTGHIPGQPFLFKL VFKC '■NVICI * * QYK W* Q 

N1GVKNKSFCPH*SSSPSL*FIGHHSRNF/CSFK 

TEPHSVVQAGGQWRNLSSLQAPPPGLMPLSR 

ISLMSSWDYRRPPQ 


19 


1369 


A 


302 


3 


445 


NSPSRWAKIQMFEHTFCG* GCG/ER/NVHIHCS 
WICRLRPLL WRA VR K YLSKLKN AEL SFD PG V 
SLLRIYAIDMPTSI*DEKEALLFAFLAFHE*HC 
KSRIWAV1Q/CIHL WDWLRKL* CFHRMKFY A 
AV*NKPRHLLSHIWKDVQNILLK 


20 


1370 


A 


304 


1 


1339 


FFFCGKEVPLFEQNKHPGPRATTSPGA/HARA 

LLS AGEFT AG VGLSP* AIHSF V WLCTFIQHG A 

GGPCHQPGGSPGPWMHTTQAGHLWEGAYPG 

GSSTWHQVPGQLGGSWGPRERSLLGSFIKCSP 

CPHPPGFRLWMSPNQKPPTENPGVMGRVWR 

LMPGESPLIWEAEGKEDHLSPEGQGHSE/PVA 

PLHS SLGNTVKP* PKNQKPKQNRSRHQ Q\GF 

MAGQGQSRPAAR*PPCPALTPASHSAGTWPP 

RICRTVPGGPCPSPSGFRSCRR*GFSA*TRSWP 

DAEPPSTPDTAPRCCTQSDTSSQGPQ*S* WRR 

CRALPGRLCSAPAAGLRRARPRLSESRRGNSP 

PASPAAASARCPSWGPSCPARPPSRPAAGTEP 

AAPSRCTAWLRGEREPGPRPPGRRPRSGRGP 

VSFAPEVLSLPAVRQTKSWRWRNEEErrRPW 

ALVRSRGG 


21 


1371 


A 


326 


799 


1587 


GSQVLPFPJPSQDSATLPQDA*GPRAAPGQPVC 
E*GLQGAGVRRLRGEVLCQPQP+GAL*EQCLP 
HLSFSPRQGAAPDTEPSAWGPAPTGATGPGLP 
LRHVRLF S AGAPRGAATPCPPALLH GPA WPP 
ARPMFRGHPPVRPLGPWGKVAAGPRALCLA 
GVP A VQGEC ATKPS G * GL* P AHLRGPPGPEVL 
Q WHWQLS AGRDPVPAEDPPL* EGPLGPGGPA 
AAQAEPG ADPEPEDKDQ AAESRPAG AMSL SA 
QGSGPVGGQGLR 


22 


1372 


A 


327 


146 


652 


PHLENPHPEHSFPGAPLT*STLSWSILSPR£PSP 

GAPCYPGHPHLENPHLEHLLTft r RTVTWSTLL 

PG APC YPEHPHLEHPLTWSTPHLEHP SPGEPL 

SCRTPTRSILHRDHPLP*CLSTEESPI*GWGSLP 

APPSTPLVLDVAPPGPQPASSCPGRDSCYSVP 

GTWSP 


23 


1373 


A 


348 


397 


2 


C1VSSCQGTRKPCHLEDANKINKQSPTLEKJES 
LQESL* VKQ* LI V AEK YVQILHPRKK Y FQRPL 
TSnSlEKRKMKKRK£EKKKCRERMQRRSKWRR 
EEKKE* RREEVEERKXEKEDRKERRKETSPRG 
SRRLLRD 


24 


1374 


A 


362 


170 


352 


GRALDTAAGSPVQTAHGLPSDALAPLDDSMP 
WiiCjKl I WbLHKKKJiLAKl Li.VbKvK.OPQ 


25 


1375 


A 


384 


373 


128 


YLITTILETGYLWKNRHSDQ+KRTENPERDQH 
KYPKVDFCKSNSMKNRLCNKWHWTNWrrrD 
KKINLNLKPHTKLTPNIKKN 


26 


1376 


A 


397 


383 


165 


E VKNTNPFrFSGTNLTT WIRSI * RKSDE IN QRTK 

♦MEKYSISLDRRLNTVKMSFLPNLIYKFNTISI 

K1PANF 


27 


1377 


A 


406 


103 


380 


KSKATGYMVN1"KL1V\FLYANDEQLEIEMNK 
rVP\FNGSKNKWTNLTKYQNIQNRHAENYKI 
LVNKIEDLNKWRNVLLSWIGRRNIINTMT


28 


1378 


A 


408 


14 


427 


TICTN1CFNNLDEIK/FLERHKLSKLTQEEVENL 
ITLKTSRETEL VINK* VIPHKEKPGPDSFTGEF 
YQTF KEEL/I I/ILHKLFQTIKYGRJLPNSVYETS1 
TLKPKPEKDLMCENYRPLPLSNIDAKNLNKTLA 
NRI**HIR 


29 


1379 


A 


434 


395 


128 


1 YSKMCMERQRLNN* 1LKKNK.V RGIAVPD VK 
VYYKPTVIICTS WIL ♦ KDSHIVE WNRLENLEID 
PN/IKRLILDKGAEATEWRKDSFFRQWQ 


30 


1380 


A 


455 


2 


228 


FFFETESHSVTQAGVQWCNPGFKRFSCFGLSS 
SWDYRYAPPRP\ANF\*FLVETGFYYVAQAGL 
KJLLSPGDLPALAS 


31 


1381 


A 


462 


393 


2 


QLMFDKGVKNIH\WGWTPPFTK*YWKNWISI 
CRRMNLNP YL S R YTK IN S RUCDLT V RP EPI KL V 
EENTGKTIQDTGLGK.*F1AKTSKAQSTKTNK* 
KRQTRYIKLKVKKSTASKENNR VKRQPLE* FK 
1FAN 


32 


1382 


A 


474 


125 


471 


VKPYEIAVFLVKPIEYX*HLLSDPAIPLSGI*LK 
EIKAYT/RRICTPMFAAPVSV WRN* KQSK7CQ 
KQ*YVHRMEYYTTIKRSEILICTTTWVDFRNT 
ILRETDRII IKTTYDV1SLI 


33 


1383 


A 


488 


1825 


2 


KSACSFICSEEQPASPSFLKPGTYASETVRPRDP 

HAAGPRRDSSEAETRRPRGATXiSGTVVKGT 

PGSPAPPCS WGHGG\ETEGAG* CPAAPGTDLR 

APGGSAGS*\GLPSAGGSRGRKGWRAAGRQP 

STR*GRPGRHGGRGE*AGHPEPRQSALQSAG 

L/ASSPEPMGAALAEDGSGDSRGAGPRPQE*P 

PSVLSRS\GS*G*G*AASGTASSPRSHSSRLGPP 

SAGFHGLRCGQPPFAAAPPGPWPGTGRPAGG 

AGSPP AAAGT APP ATRGAQ SRRQNRT AGRNA 

SPQTAAGAGSPVQWALSRATG*TGETGSWC 

AGGTHQATHLTAA W VCPPTWS VRPGG SGPA 

AGLGR*GRHPAQSPPLPVPRG*PAWPQEAPSP 

SPASSEVALSSGSCWPDQAPGPARGSPPAPLA 

PAWPAAGRGRQR* GRQS AHPPPRR* STA VSL 

SGTS* WRRSP* AGTRTQQC*SPWLVPACSSRP 

L*RGTRRPSTQQSPQTTGTPGRSAGPGHPRS* 

GGRSPAGTGHLGAQTVASPH*GHWPTALSCL 

WASASPPGPEAPPQTGACIGTNCRYRAASAR 

RSSVAPACA*GWQ* AGSPPAVLRGPP* RVRER 

GALTHRPRAPDE 


34 


1384 


A 


497 


422 


2 


APGASVGRAQAAEG*RGGPTGRPPSALGVS/E 

AGRAGRAGEGRPVPPAYPLCKSAQTSGPPKA 

RLSXPPLASCGGRGPPGGAACATCAPPAGPAR 

SSRCRRRSPPE*GPR*PSRPARPSPGSAASRRQ 

KLTPCRCQFRGLCA 


35 


1385 


A 


509 


156 


475 


PTPYPGE*QAAFLLRGPGLRPPA/DPSLR/HRN 
LTELWAVTDEMVGLFAALLAERRVLLTAS 
KLSTLTSCDHAFCALLYPMRWEHVLIPTLPPH 
LLDYC* CPPLPRT 


36 


1386 


A 


512 


3 


1631 


FFFSFVCHL YCVSP 1 PuPHGRLATWL/PGLL A 
FLGLAAGGQTLCPAGELPGHARAQASGAPGS 
VLIA VPGRRRV1 ITCGPGPAAPSTRGECPPPAL 
GHTRPARPRPV\PFAPAVPQEPGGQGHGAA/P 
PATGHS APRGCPPARAAPTG S ATPAPPPAACA 
AFHS A WS VPPAGRQQG * RVP AP AFRRTTPGT 
PGQHLLDRPGAPPAQGSGPAPAPPPRLAGPA 
GPAAPPPGPPAAS WHSSLSKS S SSL\G WSPPLP 
VGPGSLQ*TPPPQGPIILSGSCGGTSSWRGQR 
AAVARRLRSWNACGLSRVAGRSSASYPGRE 
GRPSQSQ*PAGPPGMRGCCLRGW*PSSSGSD 
GPGPHPASTWLRAGKTGPSPPACGCA*LPPPS 
VSAAPQSPRTRCPRGCAAAAGLCVLAAAGAS 
HGA\GLPGVRVHTQRVHIH*GAG/GCQTPRPR 
LRSLPVLGLPAPRCP V S AHP WHRRS GS SCHA 
ARL VPRHP APG CP* *TG *\PLITGFPEP* A* GLP 
NHQAVGLEASGALQAGHRDELPTMVQLLDH 
SPDYPLKGRPHAP 


37 


1387 


A 


620 


828 


1 


FRLPLAAGA/RGAAEPRVAVSMAPDPSAKIH 

WEASPEMQSKCHQKGKNNQTECFNHVRFLQ 

RLNSTHLYACGTHAFQPLCAAIDAEAFTLPTS 

FEEGKEKCPYDPARGFTGLIIDGGLYTATRYE 

FRSIPDIRRSRHPHSLRTEETPMHWLNG*EDE 

AQDDG G * GTI S SFLLP WPADHPTPKSPGEP VH 

SIPVCCQVRGQPQSGGKESPACLKSLSNCLTH 

\DAEFVFSVLVRESKASAVGDDDKVYYFFTE 

RATEKESGSFTQSRSSHRVARGIPPL 


38 


1388 


A 


739 


1 


427 


FRAMVSSTLKLGISILNGGNAEVQ/QGNRGK.G 
TSEEGKEG * EVP V*LPV SPPLPRPLQKMLD YL 
KDKKEVGFFQSIQALMQTCNGEKVMADDEFT 
QDLFRFLQLLCEOHNNDFQNYLRTQTGNTTT 
INIIICTVDYLLRLQESI 


39 


1389 


A 


767 


1 


1030 


TLDLTGPLLLGGVPNVPKDFRGRNRQFGGCM 

RNLSVDGKNVDMAGFIANNGTREGCAARRN 

FCDGRRRQNGGTCVNRWNMYLCECPLRFGG 

KNCEQGEWPASSIPPVTAAWEALLLDVPGTT 

VRGLHIQVRQPLWYAAFTVDSHRPLQETVL 

RRAPAPASGVPSPSGVGWDR*AGPAEPSPSTP 

AI'VHSVPWYLGLMFRTRVKEDSVLMEATSGG 

PTSFRLQVTGAPCHQGTC+VGARGRDPMLSG 

LRVTIX?E\VHHLUELKNVKEDSEMKHLVTM 

TLDYGMDQVSWHLHLLWG*TLPPAQGKTGA 

SEDK VS VRRGFRG CMQVRGGCGGRGEACPS 

QAAPRL 


40 


1390 


A 


801 


69 


399 


IHKIIIHKEDLNKWKYILCSGMERLSTVMrPVV 
PQIIYKFN A* Q WILKFTW* E* G AKITILRKNKL 
RGLVLVPLSTC*VKYLLDKVLPHIKTYYEAR 
VNKSWLVQVTIM 


41 


1391 


A 


835 


7 


195 


SMLKERKVFQFPSCLFFQYITWLGPPYHVLFD 
SSVTNFSIGAK*D1LQSVMNCLYAKRIPCVT 


42 


1392 


A 


841 


1 


415 


G STHASGYDKTPDFILQ VPVA VEGHIIHWIES 

KASFGDECSHHAYLHDQFWSYWNSLKHRTW 

QGIGTVASNLSQL*TLNAPFPELLLFRSLARTG 

FVLT*\RFGPGLVIYWYGFIQELDCNRERGILL 

KACFPTNIVTL 


43 


1393 


A 


845 


358 


92 


PALSPAPVPQKKGSPLPLDPCLGPSSWLLSVG 
LGWPRL* PRRGPGDPG SLPATPPLLTPPHTLLP 
QRPMLPPSHAGLARPPPPEPISVP 


44 


1394 


A 


853 


452 


1 


LPQYCFFPRLSPKSKLVKHS AL* ♦PSALKPPTK 

SPRCIPRTSLYFTICC/PPALQL/SPIEDPPAIYRS 

PPTHMLRSASQPLNQAPTLVKGHPPSRFLQG 

QVSCPPQPTLPREKPLPLHLRPPPRPAQPPLPR 

PLTFSTRRNVDPEIPERFR 


45 


1395 


A 


894 


379 


162 


GVYPPTVFDNYSVQTSVDGQIVSLNTWDTAG 
QEEYD/RLRTLS*PQTSIFV1CFSIGNLEFPIYGT 
WLSMSMGK 


46 


1396 


A 


900 


1 


366 


TTKKTL1SNNVS S P^U'ILPFXKAFSLAFNDPL 
E IQKYMRT/DQ * CVTHDISLYIVTKLALIFLIPR 
VFLFHQLNIT* * CLHFFTMTTFIAIPFSFLFLGR 
D/KSL AMLPRLV SN S WPQ VILPP 


47 


1397 


A 


944 


162 


2 


QLQNLASRGCL* SQLLRRLRRENRLNPGGGG 
CSEIAP\CTPAWVTQRDFFRKKK 


48 


1398 


A 


963 


216 


308 


HFTPDRIAJVXNTRDSHCWRGC*EEGAPARC 


49 


1399 


A 


967 


466 


1 


PRKRES WW GERLP/PRGFPP AAED APAPGWK 
GRKHASRTARAHVFHPIRQSIRSPVRGRPGDP 
RAAHTRSAGTRLQCKASRGG*GKGPAPTR*E 
GGPG S AP APLP ASSGC SLFPDS SPWTPPPPAPG 
AAAAQP**TPRCPAALRAGAHIGRVGRPY 


50 


1400 


A 


973 


45 


421 


EKCIQALDVFVFC YIDHSSHCLMSCD* E/DQA 
LNFMPLEMEPKMSKLAFGCQRSSTSDDDSGC 
ALEEYAWVPPGLRPEQIQLYFACLPEEKVPY 
VNSPGEKHRIKQLLYQLPPI IDNEVRYCQSLSE 
E 


51 


1401 


A 


992 


2095 


194 


IRIRHEAARSCLGCAAGHVPAPGLRLLPTVRG 

PPGRRGPAAPGCVCY* SGESTFVSH VPQRMA 

WPGSAPPRGFHPLQSQTSPSDTVSSPQLSKEE 

DGPGWEHPLSSSL*SLGQAGGNH*QPEELAG 

WEPRGPPSLAPSSPT/TNfWTALVLrWIFSLSLS 

ESHAASKDPRNFVPNKMWKGLVKRNASVET 

VDNKTSEDVTMAAASP VTLTK GTS AAHLNS 

MEVTTEDTSRTDVSEPATSGVAADGVTSIAPT 

AVASSTTAASITTAASSMTVASSAPTTAASST 

TVASIAPTTAASSMTAASSTPMTLALPAPTST 

STGRTPSTTATGHPSLSTALAQVPKSSALPRT 

ATLATLATRAQTVATTANTSSPMSTRPSPSKH 

MPSDTAASPVPPMRPQAQGP1SQVSVDQPW 

NTTNKSTPMPSNTTPEPAPTPTVVTTTKAQAR 

EPTASPVPVPHTSPIPEMEAMSPTTQPSPMPYT 

QRAAGPGTSQAPEQVETEATPGTDSTGPTPRS 

SGGTKMPATD SCQPSTQGQ YMV/DHH* APHP 

GRGRQN SPSGO A VTRGDPFHHSLGFVCP AGL 

♦ELQF.EGLHPGGLLNQRDVCGLRNVRGAGA 

WRE A WPLPRPFLLPLRPNQVLPN SFGAIEEIC 

QMLKHI 


52 


1402 


A 


994 


1 


462 


ESGEFLVSFTLKJKFTNVFHHrNGMKFFNK/LIF 
*SHTDIAFYK1QHPFMLKALTKWA*EGT*PDR 
RYLH*SLRLNGEQLKTFPLRSGMR*G/CA1LPL 
VLNAMLSIVPAVVPAGKTRHEKEITCPLIGQE 
EK*FS*FVGDMNTCVENKKESKJCLLE 


53 


1403 


A 


1011 


1 


630 


PEVIQQSAYDSKADIWSLGITA1ELAKGEPPNS 

DMHPMRVLFLIPKNNPPTHCWRRLLESFKEV 

" L MLA* TKD P SIXRPT AKELLKHKFI VKN SKKT 

SYLTELIDRFKRWKAEGHSDDESDSEGSDSES 

TSRENNTHPEWSFTTVRXXPDPKKVQNGAEQ 

DLVQTLSCLSMnTPAFAELKQQDENNASRNQ 

AIEELEKSIAVAEAAGPG 


54 


1404 


A 


1016 


1 


222 


IS IDA* KAFDKIQH/CFMITTLKKLGIDGKYLN 
TIKAIDDRHTVSTrLNVFKLK AFL* RSGTRQRF 
PISGSGARI 


55 


1405 


A 


1033 


3 


366 


HASVDGDEGSDDVYYYYTPAILRELQALNTA 
EAAEHRPEEDRMLSEDPWRPAI IMIKGYMPL 
HKIPHTE VTD VTG LN QS HL Y QHLNK GTPMKT 
QKRAAVLYTWHVLEQLEILRQINQQSHGPG 


56 


1406 


A 


1044 


5 


429 


SVLTLQTRSPSICPLS\RiCLMDWEVVSRNSISE 

DRLETQSRASRSPPVTPNQSQETPVDGKPLAL 

PPNQSQKNIRYHIHYLHLQYYLDRHISATLPIP 

SSSGIPTPIAV1TDALTDLVELILGQPCSEESGR 

APGTLFLLAL 


57 


1407 


A 


1050 


U 


430 


G A Y AF ETNG FP IML V LTTD Kl EG D V G IAG L YD 

\fH\ISLPMAFLLRTLVRCTSYIIP\THVLSTPV 

TCLRRREKDGVIVDVLSDTASNHNGFPVEEH 

ADDTHPARLQGPTLRSQPMGPLKHKAFEERA 

NLGLVQRRLRLED 


58 


1408 


A 


l u jo 




419 


LKHRDTPVVGANNRALSCTPLTSLTLCALCPL 
PCLGCPTXATCRLYQTTVAVVF 


59 


1409 


A 


1064 


3 


425 


KAFSFTTSLIGHQRMHTGERPYKCKECGKTF 
KGSSSLNNHQR1HTGEKPYKCNECGRAFSQC 
SSLIQHHRIHTGEKPYECTQCGKAFTSISRLSR 
rwpnrrnPVPPT-TrisrFPGiirvFSYH^Ai iihorth 

nni\ in i vJCiAX r n i tj v VJIV t 1 O i n JAIjI i 1 1 v<JIvJl J. x 

TGEKPYACKDVGK 


60 


1410 


A 


1065 


204 


419 


GG PPGPFL AHTHAGLQ APGPLL APAGDEGDL 
LLLAVQQSCLADHLLTASWGGK/DPrPTFCALG 
EGQEGLPLTV 


61 


1411 


A 


1079 


3 


383 


RH SRAHLCQPFHL VMRDLLQLGQD IPQGCHY 

LEENHLIHRDIAARNCLLSCAAPTRAATIGDF 

GMARYIYRTRYYQLGDRAL/LPRKWMPPEAJL 

LEG1FTYNTDSWTFGVLLWEIFSLGYMPYPGR 

TO 


62 


1412 


A 


1080 


1 


859 


V Y r, r L W b KKr a O a a ur KKKKr A b KtyMMttK 
ANLMHMMKLSIKVLLQSALSLGRSLDADHA 
PLQQFFWMEHCLKHGLKVKKSFIGQNKSFF 
GPLELVEKLCPEASDIATSVRNLPELKTAVGR 
GRAWLYLALMQKKLADYLKVLIDNKHLLSE 
F YEPEALMMEEEG MVIV G LL V GLNVLD ANL\ 
CLKGEDLDSQ VG VIDFSLYLKD VQDLDG GKE 
tico TTri\/i nnvMWFPi "mpi-ti ■vrTvrtTii OTkT 

IDGLEKTNSKXX^ERVSAATDRICSLQEEQQQL 
REQNELIR 


63 


1413 


A 


1083 


2 


615 


SSFAKHKRIHTGEKPFICLECGKAFTSSTTLTK 

HRRIHTGEKPYTCEECGKAFRQSAILYVHRR1 

HTGEKPYTCGECGKTFRQSANLYAHKKIHTG 

FKPYTCGDCGKTFRQSANLYAHKKJHTGXEKP 

YKCKECGKAFKSYYSILKHKRTHTRGMSYEG 

DEC/QRSLN/RS SIL SNHKI1HNEEK/PLKCEKCE 

KAFNHTSICCRHKKN 


64 


1414 


A 


1084 




1 


SSSQSPEAIKQLLDSGLPSLLVRSLASFCFSHIS 

SSESIAQSIDISQDKLRRHHVPQQCNKMPITAD 

LVAPILRPLTEVGNSH1MKDWLGGSEVNPLW 

TALLFLLCHSGSTSGS\HNLGNAQQDQCKISFS 

FFSWLTTGLTTQQRTA1E\NATVAFF\LQCI\SC 

HPNNQKLMAQVLCELFQTSPQRGNLPTSGNT 

S\GFIR\RLFLQLMLEDEKVTMFLQSPCPLYKG 

RINATSHVIQHFAMYGAGHKFRTLFILPVSTTL 

SD VLDR VSDTPSITAXLI SKQKDDKKKK 


65 


1415 


A 


1087 


103 


324 


PRAFEFVHTEMIVG/RVQNIHLFTLQVLEDRA 
LFTMSVGSSLWSTY'LIHVMALP/DRELLKPNA 
SVALHKLSNALV 


66 


1416 


A 


1095 


3 


493 


HETC S VTHI V S FS LPFLNP S HP ASTPGHTENE Q 

PSLVAVFDRGKFYLTFEGSSRGPSPLTMGAQD 

TLPVAAAFTETVNAYFKGADPSKCIVK1TGE 

MVLSFPAG1TRHFANNPSPAALTFRV1NFSRLE 

HVEPNPQLLCCDNTQNDANTKXEFWVNMPNL 

MTHLK 


67 


1417 


A 


1098 


57 


356 


LKLT SLGFII GV S V VGNLLISILLVKDKTLFTRA 
PYYFLLDLCCSDILRSAICFPFVFNSVKNGST 
WTYGTLTCKVTAFLG VLRCF] ITAFMLFCIS VT 
RYL 


68 


1418 


A 


1106 


1 


1326 


MGK1SATGINMGTKCSWALVWHLESYL3PKH 

YEREGMQDWKTASGQSEEATQQSSQKPQPH 

YTTYQSSSFLKYSSESHLLAWRENSSEGSFQF 

PGRSRARPPRTRQQRRGAAAGPGRGAVRLG 

HPQSAAQPQLRAAARIPESPAAFPAQPRPGSA 

RNSDASGPASLSRTLGRASSPRPPQAPDVTAP 

SPAALAPRAARGGSRAAALAGAEAEEPLRTL 

APRPTRAAAPPPPPPPPPLPPGAPPPPVRCVSR 

RARAPPWR/PAATGPPPXRPVAPSRKLGSARAP 

APALQ1RKGTSSGLPGRGGGSGPGNNLSSVA 

GNWRGSSFAVERPGMAKYQGEVQSLKLDDD 

SVIEGVSDQVLVAWVSFAL1ATLVYALFRNV 

HQNIHPENQELVRVLREQLQTEQDAPAATRQ 

QFYTDMYCP1CLHQASFPVETNCGHLFCGSLT 

PNSIW 


69 


1419 


A 


1107 


2 


466 


FDTARLHEFGTS1TQIFAVDNREDLQKWMEA 

FWQHFFDLSQWKHCCEELMKIE1MSPRKPPLF 

LTKEATSVYHDMSIDSPMKLESLTDIIQKKIEE 

TNGQFLIGQREESLP/SS/CGPHSLMVTIKWSS 

RKRY/SYPASEPLHDEKGKKRQAPLPPSDK 


70 


1420 


A 


1111 


698 


23 


ALRRLHYVRATKV\FLSFRRPFWREEHIEGGH 

SNTDRPSRMIFYPPPREGALLLASYTWSDAAA 

AFAGLSREEALRLALDDVAALHGPWRQLW 

DGTGVVKR W AEDQH SQGGFVVQPPAL WQT 

EKDDWTVPYGRrYFAGEHTAYPHGWVETAV 

KSALRAAIKINSRKGPASDTASPEGHASDMEG 

QGHVHGVASSPSHDLAKEEGSHPPVQGQLSL 

ONTTHTRTSH 


71 


1421 


A 


1119 


2 


385 


QKQTLQNGYLDSSMDILYLGSLPPELQVSSDE 
PPGPPEQAGLSQFHLEPETQKPETTEEIQSSVLQ 
QEAAAQLPQLPEVVELS STKAXEAPALPSQSL 
EGVHSSTEQKAPAQQLPAFEE1LAPLL1HHE 


72 


1422 


A 


1127 


1 


906 


HAQ YVGPYRLEKTLGKGQTGLVKLGVHUl i 

GQKVAIKIVNREKLSESVLMKVEREIAILVRLI 

EliPFTVlJCLHG VYENKKYFPPDELTSGPS ML A 

QVSPHGKLSARRSWDLLSGFPRYXVLEHVSG 

GELFDYLVKKGRLTPKEARKFFRQIVSALDFC 

HSYSICHRDLKPENLLLDEKNN1R1ADFGMAS 

LQVGDSLLETSCGSPHYACPEVIKGEKYDGR 

RADMWSCGVILFALLVGALPFDDDNLRQLLE 

KVKRGVFHMPHFrPPDCQSLLRGMIEVEPEKR 

LSLEQIQKHPWYLGGNFIS 


73 


1423 


A 


1128 


1 


802 


LRNALDVLHREVPRVLVNLVDFLNPTIMRQV 

FLGNPDKCPVQQA/MLEPLGSKTETLDLRAE 

MPrTCPTQNEPFLRTPRKSNYTYPIKPAlENWG 

SDFLCTEWKASNSVPTSVHQLRPAD1KWAA 

LGDSLTTAVGARPNNSSDLPTSWRGLSWSIG 

GDGNLETrnTLPNTLKKFNPYLLGFSTSTWEG 

TAGLNVAAEGARARDMPAQAWDLVERMKN 

SPDINLEKDWKLVTLFIGGNDLCHYCENPEA 

HLATEYVQHIQQALDILSE 


74 


1424 


A 


1139 


60 


480 


FREPCLLVPGDHQPLREASWLA/LPPIGLWGT 
DSPLCCVEVA1PCNKGAHSVGLKGWLLAQG 
VLGMRDTTPQEHPWESTPDLCFCRDPEEIEVE 
EQPAADAAVAKGEF/QGEQIAPVPAU1AAHPE 
AADP APVHTTAI 1PKGA 


75 


1425 


A 


1147 


2 


413 


" PFPHQHPQEP\KGSCWPQSALRGQCPGPVLGV 
TTTSDLCSLQVPVSSHRNPLLDLAAYDQEGR 
RFDNFSSLSIQ WESTI^VLASIEPELrMQLVSQ 

DDESGQKXLHGLQAILVHEASGTTAITATAT 

GYQESHLSSAR 


76 


1426 


A 


1155 


38 


410 


PIISAPAQDDPLLLSFIHCLHANLLCVWRRDVK 

PDCKEIW1FWWGDEPNLVWQYTMNCMLWK 

KJ)SGKMAFPMNVGRC7FFKErHNLLERCLMD 

KNFVLIGKWFVRPYYKDEKPVNKSEHLSCAF 

T 


77 


1427 


A 


1162 


526 


350 


RFPQGLEDVSTYPVLIEELLSRGWSEEELQGV 
LRGNLLRVFRQVEKVQEENKWQSPLED 


78 


1428 


A 


1171 


1 


1293 


MAES ASPPSS S AAAP AAEPG VTTEQPGPRSPP 

S SPPGLEEPLDG ADPH VPHPDL APIAFFCLRQT 

TSPRNWCIKMVCNPWFECVSMLVILLNCVTL 

GMYQPCDDMDCLSDRCKILQVFDDFIFIFFA 

MEMVLKMVALGIFGKKCYLGDTWNRLDFFI 

VMAGMV'EYSLDLQNINLSAIRTVRVLRPLKA 

INRVPSMRILVNLLLDTLPMLGNVLLLCFFVF 

FIFGIIGVQLWAGLLRNRCFLEENFTIQGDVAL 

PP\YYQPEEDDEMPFICSLSGDNGIMGCHETPP 

LKEQGRECCLSKDDVYDFGAERQDLNASGL 

CVNWNRYYNVCRTGSANPHKGAINFDNIGY 

A WIVIFQ VITLE G W VEIMYYVMD AHSF YNFI 

YFILLII VS VREPGLLGGSFSTAQSPKCQGDSFP 

GVAAESLLLRGWVLWLPGGG 


79 


1429 


A 


1175 


1 


405 


PNDFFKDMFPDLPGGPLGPIKAENDYGAYLN 
FLSATHLGGLFPPWPLVEERKLKPKASQQCPI 
CHKVIMGAGKLPRHMRTHTGEKPYMCTICE 
VRFTRQDKLKIHMRKHTGERPYLCIHCNAKF 
VHNYDLKNHMR 


80 


1430 


A 


1182 


25 


198 


EMNELSQQLSQQGGRGASQCPSPPAPTLPNPT 
PLCQLQLQRVNTGLPTPPCHPGAGAA 


81 


1431 


A 


1186 


254 


583 


KTVLDVGAGTGILS1FCAQAGARRVYAVEAS 
AIWQQAREWRFNGLEDRVHVLPGPVETVEL 
PEQVDAJVSEWMGYGLLHESMLSSVLHARTK 
VVKDGGFFLPXS SELFM 


82 


1432 


A 


1187 


2 


716 


DFVDAARNLPLESTKSPAEPSKSVPSLENDPRA 

SSQGLPSQGPVQNQGRRGEQRPKKF/TVIQHT 

SSFEKSDSLEQPSGLEGEDKPLAQFPSPPPAPH 

GRSAHSLGPKLVRQPNIQVPEILVTEEPDRPD 

TEPEPPPKEPEK.TEEFQWPQGSQTLAQFPVEK 

LPPKKKRLGLAKMAQSSGESSFESSVPLFRSP 

SQESNVSLSGSSRSALFERDDHOKAEAPSPSF 

DMGPKPLGTHMLTV 


83 


1433 


A 


1188 


517 


804 


ESPGLSKVLRTGAF AYPFLFDNLPLF YRLGLC 
WGRGHGCGQEALSTSHGYHLFCALLTGFLFA 
SHLPERLAPGRFDYIGHSHQLFHICAVLGTHF 

0 


84 


1434 


A 


1192 


45 


476 


LGDVGFWVERTPVHEAAQRGESLQLQQLEES 
GACVNQVTVDSITPLHAASLQGQARCVQLLL 
AAGAl^ VL>AKNllXji) 1 PLLbOJL^KLOQHK VCliA 
LAVLRGQGQPSPVHSVPPARGLHXREFRMC* 
GFLFDVGXNLEAHEFHFGEP 


85 


1435 


A 


1194 


69 


410 


KRSEEASAPPFPLGGTGAAPTRASLPEQILLPR 
SCLEARKSQPDEKLLSALHNSRTWN*EPRRSQ 
HRLVSPEVHPGRRGSSPGVAECKLTSAYFRT 
GRSPCPSLPGTTRTNSLL 


86 


1436 


A 


1215 


3 


405 


LPSHTCGNPGRLPNGIQQGSTFNLGDKVRYSC 
NLGFFLEGHA VLTCHAGSENSATWDFPLPS C 
RADDACGGTLRG/AEWHHLQPPLPLG/ATKN 
N AJDCT WTILAELGDT1ALVFIDF QLEDG YDFL 
EVTGTEGSSLW 


87 


1437 


A 


1216 


226 


964 


GTARFGPMVGFGANRRAGRLPSLVLGVLLV 

VIWL AFNY WS IS SRHVLLQEE V AELQGQVQ 

RTEVARGRLEKRNSDLFAWGHAQETDRPEG 

GRLRPPQQPAAGQRGPREEMVEDDKVKLQNN 

ISYQMADIHHLKEQLAELRQEFLRQEDQLQD 

YRKNNTYLVKRLEYESFQCGQQMKELRAQH 

EENIKKLADQFLEEQKQETQKJQSNDGKELDI 

NNQWPKNTPKVAENVADKNEEPSSNHIPHG 


88 


1438 


A 


1218 


1 


534 


PEFGTTISCGYLMATDVSRRPSVHKAVEIEQE 
RVFCSAGAWIIHPYSDFRFYWDLIMLLLMVGN 
L IVLP VGITF FKE EN SP\P WTV FNVL S DTFFLLD 
LVLNFRTGIWEEGAEILLAPRAIRTRYLRTW 
FLVDLISSIPVDYIFLVVELEPRLDAEVYKTAR 
ALRJVRFTKILSLLRL 


89 


1439 


A 


1223 


1 


743 


MGFDEVFMINLRRRQDRRERMLRALQAQEIE 
CRL VEA VJXJKV GMLTRSN AAPGRFDL AMLET 
L VWAPRF VD ADNL1LNPDTL S LL I AENKT W 
APMLDSRAAYSNFWCGMTSQGYYKRTPAYI 
PIRKRDRRGCFAVPMVHSTFLIDLRKAASRNL 
\AFYPPHPDYTWSFDDIIVFAFSCKQ\AEVQMY 
VCNKEEYGFLPVPLRAHSTLQDEAESFMHVQ 
LEVMVPSSPSSAQSMAVVSADHIGLVISYL 


90 


1440 


A 


1227 


2 


349 


NKTSFJFYLKNIWADLIMTLTFPFRIVHDAGF 
GPWDFKFILCRYTSVLFYANMDTSIWLGLIT/ 
YDRY/WKVVRHL/WDSWMTGI/SFTRVYLLG 
LGARLVWFGKLILAKGGHGGISWL 


91 


1441 


A 


1245 


3 


1937 


LGSSDVRAPQRSELGAESPSRMVASQAYNLT 

SALTPILTRSRVLNEEPLTLAGFNSRAPANLSD 

WQLIFL VDSNPFPFG YI SNYTVSTKVASMAF 

QTQAGAQIPIERLASERAITVKVPNNSDWAAR 

GHRSS AN S V\ VQPQ AF VG AV VTLDS SNPAAV 

LHLQLNYTLLDGRYLSEEPEPYLAVYLHSEPR 

PNEHNCSASRRIRPESLQGADHRPYTFFISPGT 

RDPVGSYRLNLS S HFRWS ALE VS VGL YTSLC 

QYFSEEDWWRTEGLLPLEETSPRQAVCLTR 

1 ILTAFGTSLrVPPSHIRF VFPEPTAD VNYIVML 

TCAVCLVTYMVMAAILHKLDQLDASRGRAIP 

FCGQRGRFKYEIL VKTG WGRG SGTTAHVGIM 

LYGVDSRSGHRHLDGDRAFHRNSLDIFQIATP 

HSLGSMWKIRVWHDNKGLSPAWFLQHIIVRD 

LQTARSTFFLVNDWLSVETEANGGLVEKEVL 

AASKASFRVPTPSNAALLRFRRLLVAELQRGF 

FDKHIWLSIWDRPPRSCFTRIQRATCCVLLICL 

FLGANAVWYGAVGDSAYSTGRVSRLNPLSV 

DTVAVGLVSSVVVYPVYLAILFLFRMSRSKV 

GWGWGPGSTGNGAWASAPCPEPPLSSAAAR 

GKGVHQR1XGKGQHT 


92 


1442 


A 


1246 


5 


562 


VFDfcENILNELNDPLRiihlVNrNLRKl.VATMP 
LFANADPNF VTAML SKLRFE VFQPGDYIIREG 
A VGKKMYFIQHG VAG VTTK SSKEMKLTDGS 
YFGEICLLTKGRRTASVRADTYCRLYSLSVD 
NFNEVLEEYPMMRRAFETVAIDRLDRIGKKN 
SILLQKPQKDLNTaVFNNQENBILKQIVKH 


93 


1443 


A 


1249 


180 


901 


TVPPPPGGPSPAPLHPKRSPTSTGEAELKEERL 
PGRKASCSTAGSGSRGLPPVSSPMVSSAHNPN 
KAEIPERRKDSTSTPNNLPPSMMTRRNTYVCT 
ERPGAERPSLLPNGKENSSGTPRVPPASPSSHS 
LAPPSGERSRLARGSTIRSTFHGGQVRDRRAG 
GGGGGGVQNGPPASPTLAHEAAPLPAGRPRP 
TTNLFTKLTSKLTRRVADEPERJGGPEVTRRP 
RQEDHLSPGGRGCSEL 


94 


1444 


A 


1261 


3 


385 


KFSQWGLTKPKLSNASP/W1SLVKKLMKKWS 

VTQNLTFREQLEAGIRYFDLRVSSKPGDADQ 

EIYFIHGLFGIKVWDGLMEIDSFLTQHPQEIIFL 

DFNHFYAMDETHHKCLVLRIQEAFGNKLCPA 

CR 


95 


1445 


A 


1282 


2 


550 


GPRDNPGVBDPRFEIVEHFGIAWFTFELVARFA 
VAPDFLKFFKNALNLIDLMS1VPFYITLVVNL 
WE STPTLANLGRVAQ VLRLMRJFRI LKL ARH 
STGLRSLGATLKYSVKEVGLLLLYLSVGISIFS 
VV A YTIEKEENVEGL ATTPAC W WWATVS M TT 
VGYGDWPGTTAGKLTASACILA 


96 


1446 


A 


1294 


1 


1456 


QLLPPSNRENAGLLVGRCLCSAALRPVGDLIT 
SSGQVAVRNAPQAGSAKAGKGKFQDNFEFIQ 
YFKKFFDANCNEKD YNP VAAGQGQETE VAP 
S 1 V AP VLNKPNQC PEG Yl C V KAGRNPNY G YT 
SFDTFSWAFLSLFRLMTQDYWENLYQLTLRA 
AETTYMIF/LV/L VILLGSLYL VTL1LAV/V AMA 
YEEQNQATLEEAEQKEAEFQQMLEQLKKQQ 
EAAQQ AATAT ASEHSREPS AAGRLSD S S SEAS 
KLS SKSAKERRNRRKXRKQKEQSGGEEKDED 
EFQKSESEDSIRRKGFRFSIEGNRLTYEKRYSS 
PHQSLLSIRGSLFSPRRNSRTSLFSFRGRAKDV 
GSENX>F ADDEHSTFEDNE SRRD SLF VPRJRHGE 
RRNSNLSQTSRSSRMLAVFPANGKMHSTVDC 
NGWSLVGGPSVPTSPVGQLLPEVIIUKPA IU 
DNGTTTETEMRKRRSSSFHVSMDFLEDPSQR 
QRAM S1A SILTNT VE 


97 


1447 


A 


1295 


2 


2057 


IQTQLPTKSS QQ LRKGGNCVRCKMQMNFIAE 

EVLLKYRn-FYNNNKGPNMLYlEIKAFVHFMI 

NRYLSYGSGPKRFPLVDVLQYALEFASSKPV 

CTSP VDDIDA S SPPSGSIPS QTLPSTTEQQGALS 

SELPSTSPSSVAAISSRSVIHKPFTQSRIPPDLP 

MHPAPRHITEEELSVLESCLHRWRTE1ENDTR 

DLQESISRIHRTIELMYSDKSMIQVPYRLHAV 

LVHEGQANAGHYWAYIFDHRESRWMKYNDI 

AVTKSSWEELVRDSFGGYRNASAYCLMYrN 

DKAQFLIQEXDLIKTGQPLVGIETLPPDLRDFV 

EEDNQRFEKELEEWDAQLAQKALQEKLLAS 

QKLRESETS VTTAQAAGDPK YLEQPSR SDFSK 

HLKEETIQIITKASHEHEDKSPETVLQSArKLE 

YARLVKLAQEDTPPETDYRLHHVVVYFIQNQ 

APKK1IEKTLLEQFGDRNLSFDERCHNIMKVA 

QAKLEMIKPEE VNLEE YEEWH QD YRKFRETT 

MYLHGLEnFQRES Y1D SLLFLIC AYQNNKELL 

SKGLYRGHDEELISHYRRECLLKLNEQAAELF 

ESGnDKJlVNNGLIIft.lNEFIVPFLPLLLVDEMEE 

KDILAVEDMRNRWCSYLGQEMEPHLQEKLT 

DFLPKLLDCSMEIKSFHEPPKLPSY STHELCER 

FARIMLSLSRTPADGR 


98 


1448 


A 


1304 


118 


453 


SGPSSRATYLHRKEYSQNLTSEPTLLQ1 1RVEH 
LMTCKQGSQRVQGPEDALQKLFEMDAHGRV 
WS QDLILQ VRDGWLQLLDIETKEELDS YRLD 
S1QAMNVALNTCSYNSILS 


99 


1449 


A 


1306 


3 


1660 


CGYFCHTTCAPQAPPCPVPPDLLRTALGVHPE 
TGTGTAYEGFLSVPRPSGVRRGWQRVFAALS 
DSRLLLFDAFDLRLSPPSGALLQVLDLRDPQF 

SATPVLASDVIHAQSRDLPRJFRVTTSQLAVPP 

TTCTVLLLAESEGERERWLQVLGELQRLLLD 

ARPRPRPVYTLKEAYDNGLPLLPHTLCAAILD 

QDRLALGTEEGLFVIHLRSNDIFQVGECRRVQ 

QLTLSPSAGLLWLCGRGPSVRLFALAELENI 

EV\EVPKIPESRGCQVLAAGS1LQARTPVLCVA 

VKRQVLCYQLGPGPGPWQRRIRELQAPATVQ 

SLGLLGDRLCVGAAGGFALYPLLNEAAPLAL 

GAGLVPEELPPSRGGLGEALGAVELSLSEFLL 

LFTTAG1 YVDGAGRKSRGHELLWPAAPMG W 

GYAAPYLTVFSENSIDVFDVRRAEWVQTVPL 

KKW RPLNPEGSLi-'L YGTEK VRL T YLRNQL AE 

KDEFDIPDLTDNSRRQLFRTKSKRRFFFRVSE 

EQQKQQRREMLKDPFVRSKLISPPTNFNHLV 

HVGPANGRPGARDKSP 


100 


1450 


A 


1318 


918 


190 


SLC VPGPVDTGTF A VMSVMVGS VTESI . APQ A 
LNDSMINET ARD AARVQ VASTL S VL VGLFQ V 
GLGLIHFGFWTYLSEPLVRGYTTAAAVQVF 
VSQLKYVFGLHLSSHSGPLSLIYTVLEVCWKL 
PQSKVGTWTAAVAGWLWVKLLNDKLQQ 
QLPMPIPGELLTLIGATGISYGMGLKHRFEAGV 
PP V APNTQLFS KL VGS AFTI A WGF AI AI SLGK 
IFALRHGYRVDSNQVWVMRDV 


101 


1451 


A 


1353 


220 


445 


DWPDLFTYPHGSPKCFQSARPE\RMYRRTVR 
S SHGNHALQE VLPRSGHGTEFTKQKHLEAAD 
HGHPPARMSIFSR 


102 


1452 


A 


1363 


542 


2 


AHLLMLNLAL\TDLL\YLTSLPFLIHYYASGEN 

WIFGDFMCKF1RFSFHFNLYSSILFLTCFSIFRY 

C VIIHPMSCFS1HKTRC A WAC A WW II SL V A 

VTPMTFLlTSTNRTNRSACLDLTSSDELNTIKW 

YNL1LTA\LLCLPLVIVTLCYTTIIHTLTHGHAN 

\DSCLKQKARRLTTLLL 


103 


1453 


A 


1371 


2 


410 


CHSTESSSDFILPGDYLLGGLCPLHSGCLQV\C 
SFNEHGYHLFQAMRLAVEEINNSTALLPN ITL 
GYQLYDVCSDSANVYATLRVLSLPGQHHIEL 
QGDLLJrYSPTVLAVIGPDSTNRAATTAALLSP 
FLVPMLLEQ 


104 


1454 


A 


1376 


3 


432 


NSRVEDRS/NmSLWTQNITVCPVRNVTRDGG 
FGPWSPWQPCEHLDGDNSGSCLCRARSCDSP 
RPRCGGLDCLGPAIHIANCSRNGAWTPWSSW 
ALCSTSCGIGFQVRQRSCSNPAPRHGGRICVG 
KSREERFCNENTPCPVPIF 


105 


1455 


A 


1379 


2 


396 


CjLGLLYLIrAAVfcOVMRVIOubNHLAV VLDD 
IILAVIDSIFVWFIFISLAQTMKTLRLRKNTVKF 
oL, YJK-rlrlvN 1 LLr A V LASIVrMuW 1 1 lv 1 r K1AK- 

CQSDWMERWVDDAFWSFLFNSLILIVIMFLW 
RPSA 


106 


1456 


A 


1383 


1 


432 


EDGHGG WS SRCLVDHAEEGHREPWKRLCI W 

CPPVSSLHFISLQ/RLPRDCQELFQVGERQSGL 
FEIQPQGSPPFLVNCKMTSGTFWTCRTDSRVF 
QN ANPSNAAH SEDQPTP 


107 


1457 


A 


1386 


719 


558 


FFFVTRSHSVAQAECSGVFTAHRSLDLVGSSN 
YPALSLQS S WDHRHTWLIF AFL 


108 


1458 


A 


1397 


631 


2 


RVAJSLLCAAIFISFMVQSAGKRWPTGVMLM 
W VLFAFL YSWPI QALLPTYLKTDLA YNPHT 
VANVLSFSGFGAAVGCCV/GGFLGDWLGTRK 
AYVCSLLASQLUIPVFAIGGANVWVLGLLLF 
FQQMLGQGIAGILPKLIGGYFDTDQRAAGLG 
FTYNVGALGGALAPIIGALIAQRLDLGTALAS 
LSFSLTFWILRNRRPGKSLVR 


109 


1459 


A 


1402 


15 


387 


VLVALPDTNVTSETVVTEVLGHRVTLPCLYSS 
WSHNSNSMCWGKUQCPYSGCKEALIRTDGM 
RVTS RKSAK YRLQGTIPRGDVSLTE.NPS ESDS 
GVYCCRIEVPGWFNDVKINVRLNLQRASTT 


110 


1460 


A 


1421 


3 


350 


HEDLS SLLTRG SGNQERERQLKKLISLRDWM 
LAEL AFPVGVL ATCA* SLLSC* YCVILFPCSCF 
FFHSPDALFSLLLLSCYFPSYCFFYYLFFSSSPL 
CLLLASSPFPLFrLLASL 


111 


1461 


A 


1426 


2 


344 


FTSTMTKPFEKESEQPA*ATLAFGAQTSTTAD 
QCALKPDLSYLNNSSSSSSTPATSAGGG1FGSS 
TSSSNPPVATFVFGQSSDPVSSYGFVNTAESST 
SDSLLFSQDSKLATTS 


112 


1462 


A 


1434 


46 


372 


TTSWTTSCTRSCT* SG ASSGPG WTPRTTW WR 
SRRSSQRTCSRACSGAWSRTW*RSS*TSSSSC 
STSCSSSSSRSCGRPGGPLGARGVHITSCLNSC 
MSSSTTSSTTSTF 


113 


1463 


A 


1439 


3 


292 


HFJ5IMTHYDRLVDE* ALNAGKQRYEKMI SG 
MYLGEIVRNIL1DFTKKGFLLRGQISEMLKTR 
GIFLTFLL SNFLI VC VLLFYVSFYLFQSCINF VL 


114 


1464 


A 


1463 


1 


396 


KQQAVPEPHSSTTTPQEQEQNWYGQDLLNLQ 

QRTKVHLPGHKTGPAVAKDTPEPVKKEFTVP 

ATSQGP*SPFSEEPPLPPSNEEVPPTLPP*EPQS 

EDP*KNA*LK.QMHAATTHWQQHQQHQVGC 

QYHGIMQ 


115 


1465 


A 


1464 


291 


2 


AGSYPSMVWSCHWGVTQKRRAL*VYSFEEG 
GRRKCGQYWPLEKDSR1RFGFLTVSNLGVEN 
MNHYKKSTLEILNPEVNPGFFFXTLWKQGEN 
NYCN 


116 


1466 


A 


1465 


667 


337 


LPPQRPA* TDS Y STCNVS SGFLAGQ SHNIHLQ 
YWTKYQVWEWLQHFLDTNQLDANCEPFQEF 
DINGEHLCSMSLQEFTRAAGTAGQLLYSNLQ 
HLKWNGDSLFLCLSLPC 


117 


1467 


A 


1479 


1 


381 


GTSGGPKRVLVTERFPWQNPLPVNRGQAQR 
VLGPSNSFQRVPLQAQKLVSSHKPGQNQKHK 
QLQATSVPHPVCMPLNNTQKSKQPLPSAPEN 
NPEEELASDPNNEESL * RPWALEDFEIGRPLG 
KGK 


118 


1468 


A 


1485 


3 


385 


TYLWL*GNPPFYEKNDGGLFELILRAKDEFNS 

PYWDDMSDSAKHFIRPLTGRDP'KPFPCDQPL 

QHPWIEGHTCLDNNIHQAASEPINNNFARSKR 

NLAFLATGVVRHMRKLFMGANLEGPGPTVS 

H 


119 


1469 


A 


1486 


1 


398 


GTTSKHH*LARSLIRGPFDHDLKPNAATRDQL 

NirVSYPPTKQLTYEEQDLGWKFRYYLTNQE 

KALTKFLKWVNWDLPQEAKQALELLGKWK 

PMDVKDSLELLSSHYTNPTVRRYAVARLRQA 

DDEDLLMYL 


120 


1470 


A 


1497 


3 


999 


MGESPAV* GYFVLAGMNSAGLSFGGGAGKY 

LAEWMVHGYTSENVWELDLKRFGALQSSRT 

FLRHRVMEVMPLMYDLKVPHWDFQTGRQL 

RTSPLYDRLDAQGARWMEKHGFERPKYTVP 

PDKDLLALEQSKTFYKPDWFDIVESEVKCCK 

EAVCVIDMSSFTEFErrSTGDQALEVLQYLFS 

NDLDVPVGH1VHTGMLNEGGGYENDCS1ARL 

NKRSFFMISPTDQQVHCWAWLKKHMPKDSN 

LLLEDVTWKYTALNLIGPRAVDVLSELSYAP 
MTPDHFPSLFCKEMS VG YANGIRVM SMTHT 
GEPGFMLYIPEEYRWGFTMLSTLVSNS 


121 


1471 


A 


1498 


3 


306 


AQFLLVGWDHIL*LrVL*TNLTELGRTTCDQN 
WPNSPDVLNHGCFYMQCLSKDCTIGYVSRE 
MLVAHTHTVEEHTGTHLQYVSWPDHSVPDD 
SSDFVEFEN 


122 


1472 


A 


1533 


121 


329 


LGLFSFVWTEVLEEPKDFSCETEDFKTLHCT 

WDPGTDTALGWSKQPSQSYTLFES*VGSGYII 

DNFFLA 


123 


1473 


A 


1547 


111 


408 


DARTTWKPRNGSSGIWPGDGAK*PPAVEQAE 
RGHVEM1EKLTFLNLHTSEKDKGGNTALHLA 
AKHGHSPAVQVLLAQWQDINEMNEKQQTPL 
IIVAADRG 


124 


1474 


A 


1555 


1 


745 


MTFDDDDKNTYGV AL VWKKFQTQ SLRL SDL 

HRKSHL WRGI V SITLI EGRDLKAMD SNGLSDP 

YVKPRL.GHQKYKSKIMPKTLNPQWREQFDF 

HLYEERGGVIDrTAWDKDAGKRDDFIGRCQV 

DLSALSREQTHKLELQLEEGEGHLVLLVTLT 

ASATVS1SDLSVNSLEDQKEREEILKRYSPLRI 

FHNLKDVGFLQVKVIRAEGLMAADVTGKSD 

PFCVVELNNDPXLTOTVYKNLNPEWNKVFTL 

* V ALV WKKFQTQSLRLSDLHRKSHLWRGIVS 

ITLIEGRDLKAMDSNGLSDPYVKFRLGHQKY 

KSKIMPKTLNPQWREQFDFHLYEERGGVIDIT 

AWDKDAGKRDDFIGRCQVDLSALSREQTHK 

LELQLEEGEGHLVLLVTLTASATVSISDLSVN 

SLEDQKEREEI LKRY S PLRTFHNLKD VGFLQ V 

K VIRAE GLMAAD VTGK S DPFC WELNNDRL L 

THTVYKNLNPEWNKVFTL 


125 


1475 


A 


1556 


57 


509 


GGP APN SR YAEr * KNS L AMT* AHADCENYV A 

CGGLDNTCSIYNLKTREGNVRVSRELPGHTGY 

LSCCRFLDDSQIVTSSGDTTCALWDIETAQQT 

TTFTGHSGDVMSLSLSPDMRTFVSGACDASS 

KLWDIRDGMCRQSFTGHVSDINAVS 


126 


1476 


A 


1592 


3 


178 


KSEKSCVSSLAHFGTSCQRDYDAMVKLVETL 
EMLPTCDL ADQHNIKFHY AFALNR* ER 


127 


1477 


A 


1612 


1 


497 


TESPLLVRPYLPYITKSELHAIMTAGFST1AGS 

VLGAYISFGVPSSHLLTASVMSAPASLAAAKL 

FWPETEKPKITLICNAMKMESGDSGNLL+AAT 

QG AS SS ISL VANIA VNLI AFL ALLSFMNS AL A 

WVGNMFDYPQLSFELICSYIFMPFSFMMGVE 

WPDSFM 


128 


1478 


A 


1619 


286 


486 


CCMN SKAQES VFKN VLCNPP ALSEMPD VKA 
EDEVDFRASSISEEVAVGSIAATLKMKQGPM 
TQAfNR 


129 


1479 


A 


1627 


1 


395 


PTRGALRYWIFGRFLCNIWAAVDVRCCTATI 
MGLCIISIDRYVGVSYPLRYPTrVTQRRGLMA 
LLCVW AL SLVI Yl GPLLG WRHP APEDETI CQI 
NEEPG Y VLF STPGSK YLPL ATML V MN * RVYRV 
AKTE 


130 


1480 


A 


1638 


2 


466 


DPRVRTKJVNRKTTIYEIQDKTGSMAWGKG 

ECHNIPCEKGDKLRLFCFRLRKRENMSKLMS 

EMHSFIQIQKNTNQRSHDSRSMALPQEQSQHP 

KPSEASTTLPESHLKTPQMPPTTPSSSSFTKVT 

KDKDIK*LLFNLYSSVErLPEVLHLKT 


131 


1481 


A 


1651 


607 


3 


LAEG GDVFDC VLNG GPLPESRAKALFRQMVE 
ATRYCHGCG V AHRDL KCEN ALLQGFNLKLTD 
FGFAKVLPKSHRELSQTFCGSTAYAAPEVLQ 
GIPHDSKKGDVWSMGWLYVMLCASLPFDD 
TDIPKMLWQQQK.GVSFPTHLSISADCQDLLK 

PJXEPDMTLRPSIEEVSWHPWLAST**KQWQV 

LSNKVGGESKPKKKK 


132 


1482 


A 


1656 


150 


48 


LVAKSLLYCGCLFFLLQLAKNVGNNSFNDIM 
EANLTSPSPKPTPSSDM*VFLrY*TYFGAWHV 
VDAQ 


133 


1483 


A 


1660 


3 


406 


RKHDCLLIQKLSDVP*ECQNNQL*ICLTEICEKE 
KKEFKKKMDDQRPEKITEA* SKDKSPMEEEK 
TEMTRSYIQEVGRYIKRLEEAQSKRLEKLREK 
HKEIRQPILDEKPKGEGSSSFLSETCHEDTSWF 
PNFTP 


134 


1484 


A 


1666 


1276 


466 


PGSTHASARITlY*L*irLSNATEVDNNFSKPPP 

FFPAGAPPASSSSSSSSSSPPTVSTAPPL1PPPGF 

PPPPGAPPPSLIPTIESGHSSGYDSRSARAFPYG 

NVAFPHLPGSAPSWPSLVDTSKQWDYYARSS 

SSSSSSSSSSSSSPRDRDRER*RTRERERERDHS 

PTPSVFNSDEERYRYREYAERGYERHRASRE 

KEERHRERRHREKEETRHKSSRSNSRRRHESE 

EGDSHRRHKHKKSKRSKEGKEAGSEPAPEQE 

STEATPAE 


135 


1485 


A 


1673 


1 


417 


PTRPVNSSQAFALVYYTLGALGGNLIAHMGL 
GYRYWAGIGVLQSCESALTHYRLVANHVAS 
DISLTGGSVVQRIRLPDEVENPGMNSGMLQE 
DLIQYYQFLAEKGDVQAQVGLGQLHLHGGR 
G V* QNHQRAFDYFNL AA 


136 


1486 


A 


1678 


525 


9 


ANTSLSSAAVSAVSPPPCRTSTATTLPPPMPSF 

FCVFPSPSMSPSPSEFLSCIASVSRVHSLSSSSS 

GSSSTASSLNFSAIMGSSSATASWVLSTASTPP 

CPSALPSSPAQES* SLAASSSA WPVAGISPSGA 

CTFPAGSASGAAKAPSPSWRCPSFRALFSLLD 

SSSLSL 


137 


1487 


A 


1680 


1 


2999 


AHRDE1QRKFD ALRN SCTVITDLEEQLNQLTE 

DNAELNNQNFYLSKQLDEASGANDE1VQLRS 

EVDHLRRErTEREMQLTSQKQTMEALKTTCT 

MLEEQVMDLEALNDELLEKERQWEAWRSVL 

GDEKSQFECRVREEQRMLDTEKQSRARADQ 

R1TESRQWELAVKEHKAEILALQQALKEQK 

IJCAESLSDKIJ^rDLEKKHAMLEMNARSLQQK 

LEI'ERELKQRLLEEQAKLQQQMDLQKNHIFR 

LTQGLQEALDRADLLKTERSDLEYQLENTQV 

LYSHEKVKMEGTISQQ'I'KLIDFLQAKMDQPA 

KKKKVPLQYNELKLALEKEKARCAELEEALQ 

KTRIELRSAREEAAHRKATDHPHPSTPATARQ 

QIAMSA1VRSPEHQPSAMSLLAPPSSRRKESST 

PEEFSRRLK£RMHHNIPHRFNVGLNMRATKC 

AVCLDTVHFGRQASKCLECQVMCHPKCSTC 

LPATCGLPAEYVTHFTEAFCRDKMNSPGLQT 

KEPSSSLHLEGWMKVPRNNKRGQQGWDRK 

YTVLEGSKVLIYDNEAREAGQRPVEEFELCLP 

DGDVSLHGAVGASELAKTAKADVPYILKMES 

HPIITTCWPGRTLYLL APS FPDKQRWVT ALES 

WAGGRVSREKAEADAKLEGNSLLKLEGDD 

RLDMNCTLPFSDQWLVGTEEGLYALNVLK 

NSLTHVPGI GAVFQIYIIKDLEKLLM1AGEERA 

LCLVDVKKVKQSLAQSHLPAQPDISPNTFEAV 

KGCHLFGAGKIENGLCICAAMPSKWILRYN 

ENLSKYCIRKEIETSEPCSCIHFTNYSILIGTNK 

FYErDMKQYTLEEFLDKNDHSLAPAVFAASS 

NSFPVSrVQVNSAGQREEYLLCFHEFGVFVDS 
iOKKsKI UULK w aKL^LAf A i Kbr Y Lr V 1 hr 
NSLEV1EIQARSSAGTPARAYLDEPNPRYLGPA 
ISSGAIYLASSYQDKLRVICCKGNLVKESGTE 
HHRGPSTSRR'PASPLPQYQGQRAFLQGRRK 


138 


1488 


A 


1686 


2 


526 


GRPQGPAPGAGSPPESGPGLWAALGCSLVWV 

PLCCLGGAAGRL*ARSGKSGLRRRRAHAGPP 

PGGPCNSCP'CSAPESGGRGPLPGPGTGGVCS 

CWTRGCQTTARTAAAAAAPGPAGRRPPGGA 

PQNGSCAASASQEAAAPPPMCPPGRRWAVAS 

PPETRCPAAPGTRCRRLEAA 


139 


1489 


A 


1693 


3 


376 


LPSMSNCTSCFRLQSRTES*IRQAGHLLGRNE 
FIETKALGCAWFSLCYYLVLYFESSHKVDFVF 
IV * CFSTPPG AQMTIMSQ AC AERCN1MRLVDR 
RWAGIAKGVGTQKIIGRVHLGEQKALGL 


140 


1490 


A 


1704 


3 


376 


ERTNKFIKELIMDGKNLIAATXSLSVAQRKFA 
HSLRDFKFEFIGDAVTDDERC1DASLREFSNFL 
KNLEEQREIMVS*EGCKLISQLSRGKKJWIWK 
LVLVEVVXHLSLGTVVHCNGKMRFPEP 


141 


1491 


A 


1743 


1 


362 


LITNKVFVARELSCLDVHLDSTGSTAVVADQ 
DKLELELVLKGSYEDTQTSFLGTASAFRFHY 
MAAL*TELSGRLRSSKSNGWNGDNSTGYLTV 
PLRPLTIVKEVTMDVPAPNVRGLNWMG 


142 


1492 


A 


1769 


1 


406 


NNPSTLPRGS*PMSPRTTMGRRRQRRREHKSS 

LSLASSTVGPGGQIVHTETTEWLCGDPLSGF 

GLQLQGGIFATETLSSPPLVCFIEPDSPAERCG 

LLQVGDRVLSINGIATEDGTMEEANQLLRJDA 

ALAHKW 


143 


1493 


A 


1789 


1 


447 


QMLRNGGDQNTVPDYHFADRIRELL* PTEDQ 
KNCIP*DTYLRPSALGNIVEEVTHPCSPGPCPA 
NELCE VNRKGCTSGDPCLPYFCV QGCKLGQA 
SDFIARQGTLIQVPSSAGEVECYK1CSCGQSGL 
LENCMEMHCMDLPTDTSALVR 


144 


1494 


A 


1814 


1 


404 


PGRRFRPRLSQAGTDSGS* VFFDSFPSAPAEPL 

PYFLQEPQDAYRVKNKPVELRCRAFPATQIYF 
KCNGEWVSQNDHVTQEGLDEATGLRVREVH 
IEVSRQQVEELFGLEDYWCQCVAWSSAGTTK: 
SRRAYVRT 


145 


1495 


A 


1827 


26 


448 


XVEEKI IADTWRS XC L S D FFFH A AKXLC X E * N 
CGDAISLSVGDHFGKGNGLTWAEKFQCEGSE 
TI ILALCPIVQI IPEDTCIHSREVG WCSRYTD V 
RLVNGKSCK^GQVErNVLGHWGSLCDTHWX> 
PEDARVLCRQLNCGTAL 


146 


1496 


A 


1828 


574 


333 


QHEGGDLRRRQLGEIQLTVRYVCLRAASAC* 
SMAAET* HHVPASGADPYVR V YLLPERK W A 
CRKKTSVKRKTLEPLFDET 


147 


1497 


A 


1855 


1 


372 


ERLVLTSEHCLVLTLFWPSWTYHTLLLSRQH 

VRRJ.PKLTHA.KHDHLASIMNICLLTNYDNLFE 

TSVTYSMG*HGAPTGSEAGANWNH**JLHAH 

YYPPLLRSDTVRKFMVGSQMLAQAQRDLTPE 

Q 


148 


1498 


A 


1879 


568 


7 


LLSALDDKGGTQPSASFSNAPTIVCVTACPAG 

IAHTYMAAEYLEKAGRKLGVNVYVEKQGAN 

GIEGRLTADQLNSATAC1FAAEVAIKESERFN 

GIPALSVPVAEPIRHAEALMQQALTLKRSDET 

RTVQQDTQPVKSVKTELKQALLSGISFAVPLI 

VAGGTQV A* AV* RQGI SSLHD VQ VRTWN S 


149 


1499 


A 


1880 


611 


24 


GLNSENAL SNE AMERGWQCLRI .FAERLQD I P 
PSQIRWATATLRLAVNAGDFIA1CAQEILGCP 
VQV1SGEEEARLIYQGVAHTTGGADQRLVVD 
IGGASTELVTGTGAQTT*LFSLSMGCVTWLER 
YFADRNLGQENFDAAQKAAREVLRPVADEL 
KYHaWrUbVKuA^ V 1 V V^Ai^y E.lMivlA^OrvliJt, 

RITMEIWPVD 


150 


1500 


A 


1894 


2 


750 


GRVDFFHTDYRPLIRDSNNYVLDEQTQQAPH 
LMPPPFLVDVDGNPHPTKYQRLVPGRENSAD 
EHLIPQLGYVATSDGEVIEQIISLQTNDNDERS 
PESSILDGMrRQLQQQQDQRMGADQDTIPRG 
LSNGEETPRRGFRRLSLD IQSPPNIGLRRSGQ V 
EGVRQ^fflQNAPRSQIATERDLQAWKRRVVv 
PEVPLGIFRKI .RDFRI .EKGEEERN1 -YIIGRKRK 
TLQLSHKSDSVGLVSQSRPRTCRRKYP 


151 


1501 


A 


1900 


141 


785 


GKTIQIQTTMQNKYKTVQKQYKTIPKNKKA 

MEMQIKKQFQDTCKVQTKQYKALKNHQLEV 

TPKNEHKTILKTLKDEQTRKLAILAEQYEQSI 

NEMMASQALRLDEAQEAECQALRLQLQQEM 

ELLNAYQSKIKMQTEAQHERELQKLEQRVSL 

RRAHLEQK.IEEEL AAL QKLRb EKUS^NLLbRQE 

REIETFDMESLRMGFGNLVTLDFPKEDYR 


152 


1502 


A 


1915 


2 


377 


LVRLLDTQRDGLQNYEALLCLTNLSGRSDKL 
RQKIFKERALPDIENYMFENHDQLRQAATEC 
MCNM VLHKJiVQtRr LAlJUNDKX 
EDDDKVQNAAAGALAMLTAAHKKLCL1CMT 
QVTT 


153 


1503 


A 


1921 


1 


237 


AYQSLRLEYLQ1PPVSRAYTTACVLTSAAVQL 
ELITPFQLYFIPELIFKHFQIWRLITNFLFFVPFG 
FNFLLYMIFLYT 


154 


1504 


A 


1928 


2 


354 


EMVEGGEGKMCINTEWGGFGDNGCIDDIRTR 
YDTE VDEGSLNPGKQR YEKMTSGMYLGEIV 
RQILIDLTKQGLLFRGQISERLRTRGIFETKPLS 
QIESDRLALLQVRRILQQLGLD 


155 


1505 


A 


1929 


2 


369 


TElAKIKMEAKi^YEI^LTMrQNlJrEKJVCQA 
KSEALVLREKSTLERIHKHQEIETKEIYAQRQ 
LLLKDMDLLRGREAELKQRVEAFESYQLELK 
DDYIIRTYRLIEDDRTNTQISGHWQESP 


156 


1506 


A 


1935 


1 


270 


VTRKLPIF I VD AFT ARAFRG SP AADCLLENEL 
DbDMHQKIAREMNL.bETl ArlKALrlr I DiNr A<j 
RSCFGLIWKI'PnDLQILTSSrLPSIL 


157 


1507 


A 


1936 


584 


305 


ESKVNNEKFRTKSPFCPAESPQSATKQLDQPTA 
AYEYYDAGNHWCKDCNTICGTMFDFFTF1MH 
NKXHTQGQFQKSSDFQKEELQQTFLPPERQG 


158 


1508 


A 


1939 


1 


423 


l lnKLNV TAliPrCl &MP1Y wMrL)VPrlKL-l 1A 
NTCPVDLTDYCAQNGFYCL VYGFLPYGSLED 
RLHCQTQACPPLSWPQRLDILLGTARAIQFLH 
QDSPSLIHGDIKSSNVLLDERLTPKLGDFGLA 
RFSRFAGSSPIQSSM 


159 


1509 


A 


1974 


3 


401 


ti l oTAKLLLriKOAOKxiA V i oLKj I I A_LrxL AAK 
NGHLATVKLLVEEKADVLARGPLNQTALHL 
AAAHGHSEVVEELVSADVIDLFDEQGLSALH 
T AArVlPl-TAfVrVFTT T TJT-iriAHTMT fl^T lfFAAri 

HGPAATLLR 


160 


1510 


A 


1982 


2 


417 


KFLKDLEKQYNKEEPHLSEIGSCFLQNQEGFA 
IY SE YCNNHPG ACLELANLMKQGK YRHFFEA 
CRLLQQMIDIAIDGFLLTPVQK1CKYPLQLAEL 
LKYTTQEHGDYSNIICAAYEAMKNVACLrNER 
KRKLES1DKIA 


161 


1511 


A 


1984 


4 


770 


RETGSVSLSPSGLEGAESYAVSP1LYSSPDVKJE 
LWLETLQGQRHSHTGVKSTPGQSAAILMKLR 
SSHNASKTXNANNNIETLIECQSEGDIKEHPLL 
ASCESEDSICQLIE VKKRKK VL S WPFLMRRL S 
PASDFSGALETDLKASLFDQPLSIICGDSDTLP 
RPIQDILTILCLKGPSTEGIFRRAANEKARKEL 
KEELNSGDAVDLERLPVHLLAVVFKDFLRSIP 
RKLLS SDLFEEWMGALEMQDEEDRTE ALK 


162 


1512 


A 


1986 


864 


501 


LLNSGLFSAPDGSNLEMRLTRGGNMCSGRIEI 
KFQGRWGTVCDDNFN1DHASVICRQLECGSA 
VSFSGSSNFGEGSGPIWFDDLICNGNESALWN 
CKHQGWGKHNCDHAEDAGVICSSKD 


163 


1513 


A 


2001 


419 


187 


AVDLSroESSLTGETTPCSKVTAPQPAATNGD 
L ASRSNI AFMGTL VRCGKAKG V VJG TGENSE 
FGDHNLSTFWHS 


164 


1514 


A 


2012 


284 


597 


SLLCLFPGTSTVVCKPIVIETQLYVTVAQLFGG 
SHIYKRDSFANKFIK1QA1EILKIRKPNDIETFKI 
ENNWYFVVADSSKAGFTTIYKWERETGFYSH 
QSFTR 


165 


1515 


A 


2013 


2 


403 


EDPEELGHFYDYPMALFSTFELFLTIIDGPANY 
NVDLPFMYSirYAAFAIIATLLMLNLLIAMMG 
DTHWRVAHERDELWRAQrVATTVMLERKLP 
RCLWPRSGICGREYGLGDRWILRVEDRQDLN 
RQRIQRYA 


166 


1516 


A 


2019 


2 


927 


CCQREGLGLKAWQILLSHGRNGLPGEPASS 

QGLSAASSTPVFHLALQIDSAPDNEDWVEMLF 

NKNMVTERLQNVMVLEQCFSDSSSLYRFLTY 

SYLLAFNVWLLLAPVTLCYDWQVGSIPLVETI 

WDMRNLATIFLAVVMALLSLHCLAAFKJRLE 

HKEVLVGLLFLVFPFIPASNLFFRVGFWAER 

VLYMPSMGYCILFVHGLSKLCTWLNRCGATT 

LIVSTVLLLLLFSWKTVKQNEIWLSRESLFRS 

GVQTLPHNAKVHYNYANFLKDQGRNKEAIY 

HYRTALNNNKAWDYLCWRFRKTLrDLP 


167 


1517 


A 


2025 


696 


71 


AAASAAS SLTVTLGRLAS ACSHSILRPSGPG A 

ASLWSASRRPNSQSTSYLPGYVPKTSLSSPPW 

PEWLPDPVEETRHHAEVVKKVNEMIVTGQY 

GRLFAWHFASRQWKVTSEDLILIGNELDLA 

CGERJRLEKVLLVGADNFTLLGKPLLGKDLV 

RVEATVIEKTESWPRJIMRFRKRXNFKKKRIV 

TTPQTVLRINSIEIAPCLL 


168 


1518 


A 


2046 


2 


366 


HLQVAARVFMPLQAVDSAPK.PLKGQAQAPQ 
RLOGAARVFMPLQAQVKAKASKPLQMQIKA 
PPRLRJRAARVLMPLQAQVRAPRLLQVQSQVS 
KKQQAQTQTSEPQDLDQVPEEFQGQDQVLR 


169 


1519 


A 


2049 


1 


945 


QNLEDRE VLNG VQTELLTSPRTKDTL SDMTR 

TVEISGEGGPLGIHWPFFSSLSGRILGLFIRG1 

EDNSRSKREGLFHENFXrV^rNNVDLVDICTFA 

QAQDVFRQAMKSPSVLLHVLPPQNREQYEKS 

VIGSLNJFGNNDGVLKTKVPPPVHGKSGLKTA 

NLTGTDSPETDASASLQQNKSPRVPRLGGKPS 

SPSLSPLMGFGSNKNAKK1K1DLKKGPEGLGF 

TVVTRDSSIHGPGPIFVKNILPKGAAIKDGRLQ 

SGDRILEVNGRDVTGRTQEELVAMLRSTKQG 

ETASLV1ARQEGHFLPRELVMFRSQSH 


170 


1520 


A 


2050 


363 


1 


PVATHLTKJLNSDEHAVVISSAKTLCETVXDF 
VAK VEKTYDKTLEN A WADA VASKCS VLNE 
KLEQLLQALHTDSQAAPVLPGLSPLIVEEDAV 
ESSSEESLGESKEQLGDDVTKPSSQKA 


171 


1521 


A 


2055 


139 


675 


IPSRPWLGR1TGLDPAGPLFNGKPHQDRLDPS 
DAQFVDV1HSDTDALGYKEPLGN1DFYPNGG 
LDQPGCPKTILGGFQYFKCDHQRSVYLYLSSL 
RESCTITAYPCDSYQDYRNGKCVSCGTSQKE 
SCPLLG YYADNWKDI ILRGKDPPMTKXFFDT 
AEE SPFCMYT rVFVDHTWNKN VR 


172 


1522 


A 


2056 


3 


361 


L1QHKSAVEYAQSHLSLVSMCKESHKCSEPK 
MEWKVIORSDGTRYITKRPVRDRILKERALKI 
KEERSGLTTDDDTMSEMKMGRYWSKEERKQ 
HLVRGKEQRRRREFMMRIRLKCLKES 


173 


1523 


A 


2060 


1 


387 


GTRILSMQIPFVGFQPIRTSEHMAAAGVFALL 
QAYAFLQYLRDRLTK.QEFQTLFFLGVSLAAG 
A VFL S VI YLTYTGYIAPWSGRFYSL WDTG Y A 
KIHIPI1ASVSEHQPTTWVSFFFDLHILGCTFPA 
G 


174 


1524 


A 


2071 


74 


443 


LLMGPKAKKSGSKXKKVTKAERLKLLQEEEE 
RRLKEEEEARLKYEKEEMERLEIQRIEKEKW 
HRLEAKDLERRNEELEELYLLERCFPEAEKLK 
QETKLLSQ WKHYIQCDGS PDPS V AQEMNT 


175 


1525 


A 


2083 


139 


486 


AALTWSQPQEFWPMEMQPIVTDMVTVHWV 
AES ST VG WLCALFRVTHVG VG ATGHG WCG 
RRVLCGLPLPSPAPMPLMSLPEGESRKEREVQ 
RLQFPYLEPGHELPATTLLAFLAAV 


176 


1526 


A 


2092 


3 


587 


EG S VNFKFG VLF AKDGQLTDDEMF SNEIGSEP 

FQKFLNLLGDTITLKGWTGYRGGLDTKNDTT 

GIHSVYTVYQGHEIMFHVSTMLPYSKENKQQ 

VERKRHIGNDIVTIVFQEGEES SP AFKP SMIRS 

HFTHIFALVRYNQQNDNYRLK1FSEESVPLFG 

PPLPTPPVFTDHQEFRDFLXVKLINGEKATLET 

PCI 


177 


1527 


A 


2103 


44 


427 


GKGQVSLEGRPHRGPLCLGSWWPGSRVPGC 

CDGAWLAWACWVFGNDFPSPASAACSALLG 

CSVSTACLCVPLCSGSPLAPFRRTAALQEGLR 

RAVSVPLTLAETVASLWPALQELARCGNLAC 

RSDLQ 


178 


1528 


A 


2104 


2 


409 


ALQSTLGAVWLGLLLNSLWKVAESKDQVFQ 
PSTAASSEGAWEIFCNHSV SNA YNFFWYLHF 
PGCAPRIXVKGSKPSQQGRYNMTYERFSSSL 
LILQVREADAA\' r YYCAV r EVPNTDKLIFGTGT 
RLQVFPNIQNPD 


179 


1529 


A 


2111 


1 


312 


PTRSSTRPPSLFVHASAKGGEKEEGDDGHYL 
MRTESHTGLKKGGNANLVFMLKRNTEPKKG 
SYHFDLERLRAAHILFEREQEHLAPGGISMPL 
PPPLPLPACLG 


180 


1530 


A 


2116 


3 


366 


TS IKRAIETTD VTR SFGWDSSEA WQQHD VQE 
L CRVMFDALEQKWKQTEQ ADL INEL YQGKL 
KDYVRSLECGYEGWRJDTYLDIPLV1RPYGSS 
QAFASWCTFHLTACVSLHRIHNSTW 


181 


1531 


A 


2117 


2 


386 


YGLGAHFGRLFIQAGrNENDFYDGAWCAGR 
NDLQQWTEVDARRLTRFTGVITQGRNSEWLS 
D WVTS YKVM VSNDSHTWVTGKNG SGDMIFE 
GNSEKEIPVLNELPVPMVARYIRJNPQSWFDN 

GSICI 


182 


1532 


A 


2123 


1 


493 


RTKTDVYILNLAVADLLLLFTLPFVV'AVNAVH 

GWVLGKIMCKJTSALYTLNFVSGMQFLACISI 

DRYVAVTKVPSQSGVGKPCW1ICFCVWMAAI 

LLS1PQLVEYTVNDNARCIPIFPRYLGTSMKAL 

IQMLEICIGFVVPFLIMGVCYFrrARTLMKMP 

NIKIS 


183 


1533 


A 


2140 


3 


561 


RQAWHEAFKVRKEILTVICCLLAFCIGLIFVQ 
RS GNYF VTMFDD Y S ATL P L L ] V V CLENI IA V CF 
VVGIDiCFMEDLKDMLGFAPSRYY Y YM WKYI 
SPLMLLSLUASWNMGLSPPGYNAWIEDKAS 
EE FLSY PT WGL A VC ASLD VF AILPVP V AFIGR 
RFSLIDDGAGPFCSAAYTTTGCRTPYL 


184 


1534 


A 


2145 


3 


538 


HELT V AAADRGQPPQ S S V VP VT VT VL D VND 

NPPVTTRASYRVTVPEDTPVGAELLHVEASD 

ADPGPHGLVRFTVSSGDPSGLFELDESSGTLR 

LAHALDCETQARHQLWQAADPAGAHFALA 

PVTIEVQDVNDHGPAFPLNLLSTSVAENQPPG 

TLVTTLHAIDGDAGAFGRLRYHL 


185 


1535 


A 


2151 


2 


671 


LDKLLDRMENYNIFNEYILKQVAATYIKLGW 

PKNNFNGSLVQASYQHEELRREVIMLACSFG 

NKHCHQQASTLISDWISSNRNRIPLNVRDIVY 

CTG VSLLDEDV WEFTWMKFH STTAVSEKKIL 

LEALTCSDDRNLLNRLLNLSLNSEWLDQDAI 

DVIIHVAJINPHGRDLAWKFFRDKWKILNTRI 

RQKTLEFDFAEPLILAFPIILYTAIDNPPLVREH 

E 


186 


1536 


A 


2153 


2 


400 


GPMCDKHSAFAEKFHAGF1DYTVHPLWETWA 
HLALPDAQDILYTLEDNRNWVDSMIPQSPSPP 
LDEQNRDWQGLLENLHVELTLDEEDSEGPEK 
EGEGQTYFTSSKTLCGIVPQNTDSLGETGIHIC 
AHDKSP 


187 


1537 


A 


2158 


227 


442 


FNCFRVASDSFLENSSLLIMILPLRNATQEFIIR 
PGAVAYTCNPSTLGGWGGWITRSGVRDQPG 
QHGGTPS 


188 


1538 


A 


2167 


3 


486 


AHLGGAWLTQRSLGSWAAPGPARAAKEVVA 
CIPQNQKMNI WRMKTSKHLQLLSF VLGA VS P 
AVVVPYMMVLQENGYGVEEGIPTLLMAASS 
MDDILAITGFNTCLSrVTSSGCARSSGSRNSKS 
LRTPLGTICEGCDD SS IFSHLDHSSK WS STYG 
HSGA 


189 


1539 


A 


2168 


2 


412 


EFLSSNQITQl^NTTFRPMPKLRSVDLSYNKL 

QALAPDLFHGLRKLTTLHMRANAIQFVPVRIF 

QDCRSLKFLDIGYNQLKSLARNSFAGLFKLTE 

LHLEHNDLVKVNFAHFPRLISLHSLCLRRNKV 

AIWSSLDW 


190 


1540 


A 


2179 


64 


399 


MRLNQKTLLLESFGXXRPYTSErlAPTYHQW 
MKADELLRWTTSEPLTLEHEYAMQRTWLED 
A YECTr 1 VLDAhJCRHAQPGA J LhSCMVGDVN 
LFLTDLEDLTLGEIEVLIAEP 


191 


1541 


A 


2190 


1 


469 


C L DRAA GIRHEPvNVI YTNETHTRHRG WLARR 
LSYVLFlQERDVHKGMFAmVrTENVLNSSRV 

QEAIAEVAAELNPDGSAQQQSKAVNKVKKK 
AXRILQEMVATVSPAMIRLTGWVLLKJLFNSF 

FWKIQIHKGQLEMVKAATETNLPLLFLPVHR 
SH 


192 


1542 


A 


2197 


26 


157 


PSKXGGIRLLLTGTQLYGRFGSAIAPLGDLDR 
DGYNGEGREEPY 


193 


1543 


A 


2236 


2 


383 


EYFPN SJ WRSLFSTMDLGDIGF Yl YRILQALS 

GLAEFYHPMRKYSVHVATRYYKSPEILLDYE 
YYDYSLDIWAVGVILLELLrLKLHVPEGGDN 
EQ 


194 


1544 


A 


2241 


105 


409 


RKGVGKMPTSEGRPGQERSDWVTS YKVMGS 
NDSHTWVTVKNGSGDMIFEGNSEKEIPVLNE 
LPVPMGARYIRINPQSWFDNGSICMRMEILGC 
PLPDPNNY 


195 


1545 


A 


2245 


1 


672 


MGVASDWTKRIEYQPGSGSMPLFPSIHLETCD 
G A VS S LQIVTELQTNYIGKGCDRETYSEKSLQ 
KLCGASSGIIDLLPSPSAATNWTAGLLVDSSE 

MIFKjDGRQGAKlPDGrv PKNLTDQr J ITMW 

MKHGPSPGVRAEKETILCYSDKTEMNRHHY 

ALYVHNCRLVFLLRICDFDQADTFRPAEFHW 

KLDQQALAKVDGQPGKSrrRQLQEMPVTIQG 

ISLKPS 


196 


1546 


A 


2256 


1 


396 


FRGTPVSGLTNRDTLAVIRHFREPIRLKTVKP 
GK VINKDLRHYLSLQFQKG S1DHKLQQ VTRD 
NL YLRTIPCTTRAPRDG EVPG VD YNFI S VEQF 
KALEESGALLESGTYDGNFYGTPKPPAEPSPF 
QPDPV 


197 


1547 


A 


2259 


43 


594 


QLAIEIGVRALLFGVFVFTEFLDPFQRVIQPEEI 

WLYKNPLGQSDNIPTRLMFAISFLTPLAVICV 

VKIIRRTDKTEIKEAFLAVSLALALNGVCTNTI 

KLIVGRPRPDFFYRCFPDGVMNSEMHCTGDP 

DLVSEGRKSFPSIHSSFAFSGLGFTTFYLAGKL 

HCFTESGRGKS W RLC AA ILPL 


198 


1548 


A 


2275 


3 


404 


TCTTVVVIPRMLVDFLSESfCTTSLPECATQMFF 

FLGFASNNCFIMAAMSYDRYTAIHNPLQYHT 

LrvTTRKICLQMMMASWMVGFLFSLCirvrTVFN 

LSLCDLr^TrQHYFCDISPWSLACNYTFYHEM 

AIFVLSA 


199 


1549 


A 


2315 


1 


375 


LTQMFFIHALSAIESTILLAMAFDRYVAICHPL 
RHAAVLNNTVTAQIGIVAWRGSLFFFPLPLLI 
KRLAFCHSNVLSHSYCVHQDVMKLAYADTL 
PNWYGLTAILLVMGXDRMFISLSYFLII 


200 


1550 


A 


2334 


2 


409 


PRVRPQQRKMSFFFKTELGEKLVTKFLFETDF 
SDDPMLPSPDQLKKKAPFTNKKLKAHQTPVD 
ILKQKAHQLASMQVQAYNGGNANPRPANNE 
EEEDEEDEYDYDYESLSDDNELEDRPENKSCH 
DQLQFEYKEEM 


201 


1551 


A 


2350 


3 


512 


ISWEAQLAEIIQWVSDEKDARGYLQALASKM 
TEELEALRSSSLGSRTLDPLWKVRRSQKLDM 
S ARLEL QS ALEAEIRAK.QLVQEELRK VKDAN 
LTLE SKLKD SEAKNRELLEEMEILKKKMEEK 
FRADTGKLMLCDSALFEYKYFSNECFYFLFD 
LrVTLEAPTEFQIQY 


202 


1552 


A 


2351 


1 


1003 


PSSYSSDELSPGEPLTSPPWAPLGAPERPEHLL 

NRVLERLAGGATRDSAASDILLDDIVXTHSLF 

LPTEKFLQELHQYFVRAGGMEGPEGLGRKQA . 

CLAMLLHFLDTYQGLLQEEEGAGHIIKDLYL 

LIMKDESLYQGLREDTLRLHQLVETVELKIPE 

ENQPPSKQVKPI .FRHFRRIDSCLQTRVAFRGS 

DEIFCRVYMPDHSYVTIRSRLSASVQD1LGSV 

TEKLQYSEEPAGREDSLILVAVSSSGEKVLLQ 

PTED C VFT AL G IN SHLF A CTRDS YEAL V PLPE 

hIQVbrOlJlblHKVbrbL>VAiNHLl ArriWhLrK 

CVHELEFVDYVFHGE 


203 


1553 


A 


2361 


2 


403 


NNLNCAEPLFEQNNSLNVNFNTQKJCTVWLIH 
r.vcpvr.QTPi un owrvrtt t ivn?Fr>\/>JVTV\m 

\J I I\_T V UoirL W H^FlNr V In^LLjJjI 1 * tjErJ^'IVLlN V1V VU 

WSRGATTFIYNRAVKNTRKVAVSLSVHIKNL 
LKHGASLDNFHFIGGSLGAHISGFVGKIFHGQ 
LGRITGLDP 


204 


1554 


A 


2390 


280 


476 


SPSLLPQCLMSLSDLSLSPAPPSHLSPRCPSPQ 

AGSRLGAMRRCAREMDATPMPPAPSCPSERV 

T 


205 


1555 


A 


2400 


543 


745 


AA V ALRDI S WQQP YPMDF Y AG S SLGPWTVN 

HGQDRRPHAPGRPARGKVQEGSARPPSAVAC 

EDCSCR 
206 


1556 


A 


2406 


122 


485 


DLSPDSREDHPQGHRRLLPKRPVRGSLMPG H 

1 xlxi V o j 1 x IN i-J 1 rUyi W V o V \J i\JV1 \_j 1 \j\J 

MGANASTSPRCWDLSSGNKKWI1QVPILASIV 
ESRG GLLATG VGGMC AC VPRNQPLTGT 


207 


1557 


A 


2409 


289 


418 


LWTLYRHKQQVQHNHSNRLSCRPSQEDRAT 
HTIMVLDKENTLS 


208 


1558 


A 


2413 


64 


492 


VQGTGXXF1AFTEAMTHFPASPVWAGMFFL 

MLINLGLGSMIGTN1AGITTPIIDTPKVPKEMFT 

GGCCVFAFLVGLLFVQRSGNYFVTMFDDYSA 

TLPLTLIV1LEKIAVAWIYGTKKPMQELTEML 

GFRPYRFYFYMWKFVSP 


209 


1559 


A 


2417 


3 


877 


EKERLLDEWFTLDEVPKGKLHLRLEWLTLMP 

NASNLDKVLTDIKADKDQANDGLSSALLILY 

LDSARNLPIRYKTNEPVWEENFTFFIHNPKRQ 

DLEVEVRDEQHQCPLGNLKVPLSQLLTSEDM 

TVSQRFQLGNSGPNSTIKMKIALRVLHLEKRE 

RPPDHQHSAQVKRPSVSKEGRKTSIKSHMSG 

SPGPGGSNTAPSTPV1GGSDKPGMEEKAQPPE 

AGPQGLHDLGRSSSSLLASPGHISVKEPTPSIA 

SDISLPIATQELRQRLRQLENGTTLGQSPLGQI 

QLT1P 


210 


1560 


A 


2422 


35 


456 


REFAASDLEPFTPTDQPISPEAITQPSCIKRQRA 
AGNPGSLAATIDHKPCSAPLEPKJQASRNQRW 
GAVRAAESLTDIAEPASPQVHETPIDASQTQK 
VEPASKSRFTPELQAKVSHSRERALSTMDATP 
HHAQPQRGEG 


211 


1561 


A 


2431 


1 


764 


RRYSQKLIQHTACQLLRTYPAATRIDSSNPNP 

LMFWLHGIQLVALNYQTDDLPLHLNAAMFE 

ANGGCGYVLKPPVLWDKNCPMYQKFSPLER 

DLDSMDPAVYSLT1VSGQNVCPSNSMGSPCIE 

VDVLGMPLDSCHFRTKPIHR>nT J NPN4WNEQF 

LFHVHFEDLVFLRFAVVENNSSAVTAQRIIPL 

KALKRGYRHLQLRNLHNEVLEISSLFINSRRM 

EENSSGNTMSASSMFNTEERKCLQTHRVIVH 

GVPG 


212 


1562 


A 


2436 


1 


411 


GIRGTTGHLGCPINDDPSLTLTVSWVMEDKPI 

YIGNGTKKEDDSLTIFAVAKRDHVSDTCGAC 

TDLDHNLDKGYLTVLGEQATPTNRLGALPKG 

RANRTRDLELTYLAERTVRLTWIPGDANNRPI 

TDYDCQIEEHQ 


213 


1563 


A 


2445 


1 


1294 


MSSIGCLWVSRSSQIDGLTAEKSGPEKPHGT 

WLMPELHPKEQDLELLVLEQFLSrLPEELQIWV 

QQHNPESGEESVTLXEDLEREFDDPGQQVPAS 

PQGPAVPWKDLTCLRASQESTDIHLQPLKTQ 

L¥Lb WKPLLbPKilJCliNSb I A 1 K±i(jlbfcfclCSQlj 

LPQEPSFRG1SEHESNXVWKQGSATGEKLRSP 

SQGGSFSQVIFTNKSLGKRDLYDEAERCLILT 

TDSIMCQKVPPEERPYRCDVCGHSFKQHSSLT 

QHQRIHTGEKPYKCNQCGKAFSLRSYLilHQR 

EKACKCNECGKAFSQSSYLIIHQRIHTGEKPY 
ECNECGKTFSQSSKLIRHQRIHTGERPYECNE 
CGKAFRQSSELITHQRIHSGEKPYECSECGKA 
FSLSSNLIRHQRJHSG 


214 


1564 


A 


2461 


1 


615 


G1PGSTISSSRNIFLEDDLAWQSLIHPDSSNTPL 
STRLVSVQEDAGKSPARNRSASITNLSLDRSG 
SPMVPSYETSVSPQANRTYVRTETTEDERKIL 
LDS VQLKDL WKKICHHSSGMEFQDHRYWLR 
THPNCI VGKELVN WLIRNGI IIATRAQ AJAIGQ 
AMVDGRWLDCVSHHDQLFRDEYALYRPLQV 
LFSVYCQLECSKLEL 


215 


1565 


A 


2464 


3 


2932 


GPGVRSSQDGMADVFVHLRTAWPRCSFISGQ 

HGPGRHGRRVC SSQDSMAD VFVHLRTAWPT 

CSLISGQHGPGESVSYEDDDIPAPASLLHVNA 

AAPALTNPTAPVLCTAPNNTAQKEKVPSGMR 

QRPAGVRISSRTPDLTCAVSTHSTVPGVRISSC 

TPDLTCAVSIHSTWSVCISSCTPDLTCAVSTH 

STVPGVRISSCTPDLTCAVSTHSTVPGVRISSR 

TPDLTCAVS1HATVPGVRISSCTPDLTCAVSIH 

ATVPGVR1SSCTPDLTCAVSTHSTVPGVRISSR 

TPDLTC AV SIHSTVPG VRIS SCTPDLTC A VSI H 

ATVPGVRISSCTPDLTCAVSTHSTVPGVRISSR 

TPDLTC AVSIHATVPG VRISSRTPDLTCAV SIH 

ATVPGVRISSCTPDLTCAVSIHATVPGVRISSC 

TPDLTC A VSIHATVPGVR1SSRTPDLTCAVSIH 

ATVPGVRISSCTPDLTCAVSTHSTVPGVRISSR 

TPDLTC AVSIHATVPGVRISSCTPDLTCAVSTH 

STVPGVRISSRTPDLTCAVSIHATVPGVHISSC 

TPDLTCAVSTHSTVPGVR1SSRTPDLTCAVSII I 

STVPGVCISSRTPDLTCAVSIHSTVPSVHISSCT 

PDLTCAVSIHSTVPGVRISSRTPDLTCAVSTHS 

TVPGVHISSCTTDLTCAVSIHATVPGVHISSCT 

PDLTCAVSTHTTVPGVRISSRTPDLTCAVSIHS 

TVPGVRISSCTPDLTCAVSTHSTVPGVRISSRT 

PDLTC AVSTHLTVPGVRIS SRTPDLTC A VSIHA 

TVPGVHISSCTPDLTCAVSIHATVPGVRISSRT 

PDLTCAVSIHATVPGVHISSCTPDLTCAVSTHS 

TVPGVRISSRTPDLTCAVSIHSTVPGVHISSCT 

PDLTCAVSTHSTVPGVHISSCTPDLTCAVSTH 

STVPGVHISSRTPDLTCAVSIHATVPSVH1SSC 

TPDLTC A VSIHSTVPGLLTSVSQTSTG 


216 


1566 


A 


2477 


1 


414 


FRTKS YRKGS YRCIV SEWI AEQGNWQEIQEK 

AVEVATVVIQPTVLRAAVPKNVSVAEGKELD 

LTCNITTDRADDVRPEVTWSFSRMPDSTLPGS 

RVLARIJDRDFLVHSSPHVALSHVDARSYHLL 

VRDVSKENSGYYY 


217 


1567 


A 


2480 


2 


460 


CRTLCEGPQRFEEYEYLGYKAGLYEAIADHY 
MQVLVCQHECVRELATRPGRLSPIENFLPLHY 
DYLQFAYYRVGfiYVKALECAKAYLLCHPDD 
EDVLDNVDYYESLLDDML)rAMfc.AKfcL)L 1 Mr 
VKRHKLESELIKSAAEGLGX S YTEPNYW 


218 


1568 


A 


2483 


140 


383 


AFSSPHPSPAPQFPECGFYGLYDKILLFKHDPT 
SANLLQLVRSSGDIQEGDLVEWLSASATFED 
LQIRPHALTVHSYRAP 


219 


1569 


A 


2489 


3 


428 


SSRLVLLAGAAALASGSQGDREPVYRDCVLQ 
CEEQNCSGGALNHFRSRQPIYMSLAGWTCRD 
DCKYECMWVTVGLYLQEGHKVPQFHGKWP 
FSRFLFFQEPASAVASFLNGLASLVMLCRYRT 

TTVPACQPIi/tVHTrVdPAWV^ 

r VrAjarlvl T Jrll UVArAW v a 


220 


1570 


A 


2498 


1 


1297 


MDGEAVRFCTDNQCVSLHPQEVDSVAMAPA 

APK1PRLVQATPAFMAVTI.VFSLVTLFWDH 

miFGREAEMRELlQTFKGHMENSSAWVVEIQ 

MLKCRVDNVNSQLQVLGDHLGNTNADIQMV 

KG VLKD ATTL SLQTQMLRS SLEGTN AEIQRL 

KEDLEKADALTFQTLNFLKSSLENTSIELHVL 

SRGLENANSEIQMLNASLETANTQAQLANSS 

LKNANAEryVLRGHLDSVNDLRTQNQVLRNS 

LEG AN AEIQGLKENLQNTNALN SQTQAFIKS S 
FDNTSAEIQFLRGHLERAGDEIHVLKRDLKM 
VTAQTQKANGRLDOTDTQIQVFKSEMENVN 
TLNAQIQVLNGHMKNASREIQTLKQGMKNA 
SALTSQTQMLDSNLQKASAEIQRLRGDLENT 
KALTMEIQQEQSRLKTLHWITSQEQLQRTQ 


221 


1571 


A 


2501 


3 


500 


RVRLNNDGLSPLMMAAKTGKJGIFQHDRREV 

TDEDTRJILSRKFKDWAYGPVYSSLYDLSSLD 

TCGEEASVLEILVYNSKIENRHEMLAVEPINE 

LLRDKWRKPGAVSFYTNVVSYLCAMVIFTLT 

AYYQPLEGTPPYPYRTTVDYLRLAGEVITLFT 

GVLFFFTN 


222 


1572 


A 


2508 


3 


395 


DAHCQRKLAMQEFMErNERLTELHTQKQKL 

ARHVRDKEEEVDLVMQKVESLRQELRRTER 

AKKELEV1ITEALAAEASKDRXLREQSEHYSK 

QLENELEGLKQKQISYSPGVCSIEHQQErnCL 

KTDLEKKS 


223 


1573 


A 


2544 


2 


412 


NDP AI I SNFS AAVVHTI VNETLESMTSLE VTK 

MVDERTDYLTKSLKEKTPPFSHCDQAVLQCS 

EASSNKDMFADRLSKSIIKHSIDKSKSVTPN7D 

KNAVYKESLPVSGEESQLTPEKSPKFPDSQNQ 

LTHCSLSAA 


224 


1574 


A 


2552 


401 


1 


GASLCFISTAFTVLTFLIDSCRFSYPERPIIFLSM 

CYNIYSlAYrVRLTVGRERISCDFEEAAEPVLI 

QEGLKNTGCAIIFLLMYFFGMASSIWWVJLTL 

TWFLAAGLKWGHEA1EMHSSYFH1AAWAIPA 

VK 


225 


1575 


A 


2563 


724 


1 


MSARKJERREKGEEEGEGEKDGDEDEKEEEKE 
GLGEEEEKEAGKKKKKQEEKEKEKGAVY SR 
VARICKNDMGGSQRVLEKHWTSFLKARLNC 
SVPGDSFFYFDVLQSITDUQINGIPTVVGVFTT 
QLNSIPGSAVCAFSMDDIEKVFKGRFKEQKTP 
DS V \VTA VPEDK VPKPRPGCCAKHGL AEAYK 
TS1DFPDETLSF1KSHPLMDSAVPPIADEPWFT 
KTRVRYRLTAISVDHSAGPYH 


226 


1576 


A 


2571 


449 


3 


EGVLFVYGNYVGDVMNFEMAAEMAQEVAIP 

TRTVLTTDDISSSPIEDRDGRRGVAGNFFIFKV 

AGAACDRGMSLEACFAVTRKAhTRRTYTMG 

VALEPCSLPQTRRHNFE1GAEEMEIGMGIHGE 

RGVIREKMMPADAIVDHIMDRIFS 


227 


1577 


A 


2575 


3 


1197 


VLSDLCLFYYRDEKEEGILGSILLPSFQIALLTS 

EDHINRKYAFKAAHFNMRTYYFCTDTGKEM 

ELWMKAMLDAALVQTEPVKRVDKTTSENAP 

TKFJTNN1PNHRVLIKPE1QNNQKNKEMSKIEE 

KKAL EAEK YGFQKDGQDRPLTKJNS VKLNSL 

PSEYESGSACPAQTVHYRPINLSSSHNKIVNVS 

LADLRGGNRPNTGPLYTEADRVIQRTNSMQQ 

LEQWIKIQKGRGHEEETRGVISYQTLPRNMPS 

HRAQIMARYPEGYRTLPRNSKTRPESICSVTP 

STHDKTLGPGAEEKRRSMRDDTMWQLYEW 

QQRQFYNKQSTLPRHSTLSSPKTMVNISDQT 

MHSIPTSPSHGSIAAYQGYSPQRTYRSEVSSPI 

QRGDVTIDRRHRAHHPKVK 


228 


1578 


A 


2583 


3 


330 


LPFLGLGSVLPQGMVMASPEMNPTICSVFEA 
HIVLLFHATTFRRGFQVTVLVGNVRQTAVVE 
KIHAKVRGTWPFISPEVRKEGGLPQTGRELLD 
PTMGIKPHLWWVAA 


229 


1579 


A 


2589 


1 


448 


DDKNAQGIKRHVKPTSGNAFTICKYPCGKSR 
ECVAPNICKCKPGY1GSNCQTALCDPDCKNH 
GKCIKPNICQCLPGHGGATCDEEHCNPPCQH 
GGTCLAGNLCTCPYGF VGPRCETM V CN RHC 
h.NOGVt'H FUlCyCK_rvj W Y bF I US I A 


230 


1580 


A 


2593 


2 


138 


A VTFS WF A Y V ADITQEHERSMA YG L VCMFI 
LYLLYLLRNAFFLR 


231 


1581 


A 


2595 


185 


2 


SGPYTDFTPWPTEEQKLLEQALKTYPVNPPER 
WEKIAEAVPGRTKKACIKRYKVADLR1SK 


232 


1582 


A 


2596 


1 


391 


STVTGQPRRLLDTAGHQQPFLELKIRANCPGA 

GRARRRTPTCEPATPLCCRRDHYVNFQELGW 

RDWILLPEGYQLNYCSGQCPTHLAGSPGIAAS 

FHSAVFSLLKANNPWPGRTSWCVPTARRPLS 

LLYL 


233 


1583 


A 


2601 


184 


403 


LLFSDEIIMAAPLRIADVTSGLIGGEDGRVYV 

YNGKETTLGDMTGKCKSWITPCPEEKVNVLQ 

NSIPYWERIT 


234 


1584 


A 


2614 


178 


335 


PLTL CLPENNKPPQADA VPDKELTLPVDSTTL 
DGSKSSDDQKIISYLWEKTQ 


235 


1585 


A 


2616 


2 


896 


DVLEVYGTGVASTRHEMGTLDKHKELEDLV 

AKFLNVEAAMVFGMGFATNSMNIPALVGKG 

CLILRDEVNHTSLVLGARLLGATIGIFKHNYA 

QSLEKLLRDAVIYGQPRTRRAWKKILILVEGV 

YSMEGSIVHLPQIIALKJCKYKAYLYIDEAHSI 

GAVGPTGRGVTEFTGLDPHEVDVLMGTFTKS 

FGASGGY1AGRKARILSPPACLVPNTGSHSLH 

RLTRDLQMNEAMVALVTDRLQGWNSGEGN 

WDRADKFGDLVDYLRVHSHSAVYASSMSPPI 

AEQ 1IRSLKL I MGLDGTTQ 


236 


1586 


A 


2621 


1 


392 


NTS SFP AQPSSPARPSLPHLS QHPSNPLLPL AS 
ADHPQCGRFLPLHEPEPLCPSPSLSYPTLVSS 
WSSPFS SHHGCPPG L YPFPTSPKTIQPPG L AQL 
KMLCIPPGRQQLRGAQSMPGHGALSPLLLPP 
A 


237 


1587 


A 


2628 


398 


1 


DLVCKISGFGRGPRDRSEAVYTTMSGRSPAE 

WAAPETLQFGHFSSASDVWSFGIIMWEVMAF 

GERPYWDMSGQDVIKAVEDGFRLPPPRNCPN 

LMHRLMLDCWQKDPGERPRFSQIHSILSKMV 

QDPEPPNV 


238 


1588 


A 


2631 


1 


1104 


WSPCSLTCGVGLQTRDVFCSHLLSREMNETV 

ILADELCRQPKPSTVQACNRFNCPPAWYPAQ 

WQPCSRTCGGGVQKREVLCKQRMADGSFLE 

LPETFCSASKPACQQACKKDDCTSEWLLSDW 

TECSTSCGEGTQTRSAICRKMLKTGLSTWNS 

TLCPPLPFSSS1RPCMLATCARPGRPSTKHSPHI 

AAARKVYIQTRRQRKLHFVGGGFAYLLPKTA 

VVLRCPARRVRKPLITWEFOXjQHLISSTHVT 

VAPFGYLKJHRJLKPSDAGVYTCSAGPAREHF 

VIKLIGGNRKLVARPLSPRSEEEVLAGRKGGP 

KEAJLQTHKHQNGIFSNGSKAEKRGLAANPOS 

RYDDLVSRJLLEQGAPCSSSKKKN 


239 


1589 


A 


2636 


1 


678 


MKPDNILLDEHGHVHn-DFNIAAMLPRETQIT 
TMAGTKPYMAPEMFS SRKG AG Y SF A VDWW 
SLGVTA YELLRGRRPYI URSSTSSKEIVHTFET 
TVVTYPSAWSQEMVSLLKKLLEPNPDQRFSQ 
LSDVQNFPYMNDINWDAVFQKRLIPGFIPNK 
GRLNCDPTFELEEMILESKPLHKKKKRLAKK 
EKDMRKCDSSQTCLLQEHLDSVQKEFHINRE 
KVNRDCI 


240 


1590 


A 


2639 


389 


3 


ELLDPTTPMRTKCIELL YAALTSS STDQPKAD 
LWQNFAREIEEHVFTLYSKNIKKYKTCIRSKV 
ANI.KNPRNSHLQQNLLSCTTSPREFAEMTVM 
EMANKELKQLRASYTESCIQEHYLPQVIDGTL 
Y 


241 


1591 


A 


2640 


392 


3 


IRLTILRCVFMRI^TICVLVFTLGSKJTSCDDD 

TCDLCGYNQKLYPCWETQVGQEMYKLMIFD 

FIIILAVTLFVDFPRKLLVTYCSSCKLIQCWGQ 

QEFAIPDNVLGIVYGQTICWIGAFFSPLLPAM 

Y 


242 


1592 


A 


2642 


405 


1 


YFKNTTLLL VG VIC VAAA VEK WNLHKRJ AL R 
MVLMAGAKPGMLLLCFMCCTTLLSMWLSNT 
STTAMVMPrVEAVLQELVSAEDEQLVAGNSN 
TEEAEPISLDVKNSQPSVEL1FVNEDILDFLMK 
SPLMISQACI 


243 


1593 


A 


2646 


412 


2 


CL AMIK.GIQS SGKJ I YF S SLFPYV VLIC FLIRAF 
LLNGSIDGIRHMFTPKLEIMLEPKVWREAATQ 
VFF ALGLGFGG VTAFS S YNKRDNNCHFDA VL 
VSFINFF1-SVLA1LVVFAVLGFKANV1NEKCIT 
QNSETV 


244 


1594 


A 


2650 


1 


1271 


MTTTLIGLLKTARLLRLVRVARKLDRYSEYG 

AAVLMLLMCIFALIAHWLACIWYA1GNVERP 

YLTDKIGWLDSLGQQIGKRYNDSDSSSGPSIK 

DKYVTALYFTFSSLTSVGFGNVSPNTNSEKIF 

SICVMLIGSLMYASIFGNVSAIIQRLYSGTARY 

HMQMLRVKEF1RFHQIPNPLRQRLEEYFQHA 

WTYTNGIDMNMVTNGTCSSCTSDDGHFILVS 

NHHQGGLIYSWNDAASMQRPFNH1KSSLLGS 

TSDSNT.NKYSTPNKIPQLTLNFSEVKTEKKNSS 

PPSSDKTIIAPKVKDRTHNVTEKVTQVLSLGA 

DVLPEYKLQAPRINKFTILHYSPFKAVWDWLI 

L LL V rYT AIFTP Y S AAFLLND REEQKRR EC G Y 

SCSPLNVVDLIVDIMFIIDILINFRTTYVNQNEE 

WSDPASV 


245 


1595 


A 


2656 


385 


2 


NLT WAVPLFRDVSFY I VDLIN1LI IFFLDNVIMW 
WESLLLLTAYFCYWFMKFNVQVEKWVKQ 
MINRNKVVKVTAPEAQAKPSAARDKDEPTLP 
AKPRLQRGGSSASLHNSLMRNSIFQNKJHTLD 

PIRV 


246 


1596 


A 


2660 


200 


506 


VLVLQMNYYQMLIIYYVLFFKVNEFLAFEGPI 
LLDMRDCHLIKTNQLSQATALAKLCSDHPEIG 
1KGSFKQ I YLVCLCTSSPNGKLlfctVSMFSFIb 
NYFLS 


247 


1597 


A 


2678 


3 


267 


DAWVKNDIIFNQTERKQKISENLKHLASVRV 
VQKNL VF VVGLSQRLADPEVSPL VFFVILI FF 
VSLSYLEirFDPAQLCDSSEHTIS 


248 


1598 


A 


2687 


1 


404 


DPI 1LAAMMRTLFSLFGDVRSDVHRFSVTLF 

GAAIKSVKNPDKKSIENQVLDSLVPLLLYSQD 

ENDAVAEESRQVLTICAQFLKWKLPREVYSK 

DPWHIKPTEAOTICRFFEKKCKGKINILEQTL 

MYSKNPKL 


249 


1599 


A 


2692 


1 


440 


FRRRRRRRERDCAAQGARRHCRtiLAECKLV 

cmT/^rwin ovrx/o tut T*n axjxtct i/ci toi/ 
brPlOl YJSkVJL.KNVbOvl"^'* 1 LATHNbLK.&Ll ^K. 

FMTTFSQLRELHLEGNFLHRLPSEVSALQHLK 
AIDL SRNQFQDFPEQLTALPALETINLEENErV 
DVPVEKLAAMPALRSINL 


250 


1600 


A 


2693 


459 


21 


LLPGSLGVPILHSQPWDPSPQCPHRAPSTPRRL 
PPLG ALSQ ALTFLS RAAKNHSQ D PGXGTKPFP 
AAPAAPPPRSSLPAPLPMGLKDKGPQPAPPTIF 
NSPWHPATLPGALGPQLSQAAPSPIPPPCLMG 
ISSCPDLKLTKSSTP 


251 


1601 


A 


2694 


2 


404 


FVFDLKLRVPGFAALL1HGASSVPGPETVRLR 
QKi^KiCKAPDHSSGRKEELVTTHTVDKLETKK 
PVGRVLCGLSGELLHSLLLPRRKTEKRALGSH 
RKAGFPEHPVAPEPLSNSCQISKEGREQVLSEI 
GAGDCL 


252 


1602 


A 


2697 


421 


1 


PQKSHSGAYQCFATRKAQTAQDFAIIALEDG 
TFRJ V S SFSEK V VNPGEQFSLMC AAKG APPPT 
VT WALDDEPr VRDGS HRTNQ YTMSDGTTI SH 
MNVTGPQ1RDGGVYRC1ARNLVGSAEYQARI 
NVRGPPSIRAMRNIT 


253 


1603 


A 


2698 


65 


401 


ACCQWRRTLIPAKSTTVSCTISTPHHPFRGSYS 
FDDHITDSEALSRSSHVFTSHPRMLKRQPAIEL 
PLGGEYSSDVPRPLSTQLSSSLLGYFSTLMTG 
AAFTNN1ASSTIIL 


254 


1604 


A 


2699 


438 


301 


GQIHSQDDPPFIDQLGFGVAPGFQTFVACQEQ 
RVRGPWEAGPGVGY 


255 


1605 


A 


2700 


1 


842 


LQNRHDSSEG1RKKLVEAEELEEKHREAQVS 

AQHLEVHLKQKEQHYEEKTKVLDNQIKKDLA 

DKETLENMMQRHEEEAHEKGKILSEQKAMIN 

AMDSKIRSLEQRJVELSEANKLAANSSLFTQR 

NMKAQEEM1 SELRQQKFYLETQ AGKLEAQN 

RKLEEQLEKISHQDHSDKNRLLELETRLREVS 

LEHEEQKLELKRQLTELQLSLQERESQLTALQ 

AARAALESQLRQAKTELEETTAEAEEEIQALT 

VGLGSNIFRLLKASARMSVELALSILAHP 


256 


1606 


A 


2701 


2 


405 


FVGGPGADPPVAVMWDPRAARMDLTAYAE 
LL KES GN Q VLKN GNFSL AIRK YD E AI QIL LQ L 
YQWGVPPRDLAVLLCNKSNAFFSLGKWNEA 
FVAAKECLQWDPTYVKGYYRAGYSLLRLHQ 
PYEAARMFFEGLR 


257 


1607 


A 


2702 


2 


399 


FVES AS SRPPGCFSGDGRFWL VS EGSRRG WD 
FWSFSFLDPRYSVGGDFJ^GTVTTLANILREF 
NPSLKGFSVGTGKETSFNAFLNQAVAGGRAE 
DLPVQARRLVDLMKNDTRIHFQEDWKIITLFI 
GGNDL 


258 


1608 


A 


2709 


1 


1097 


SVGARQGEARDRIRRFFPKGDLEVLQAQVER1 

MTRJCELLT^YSSEDGSEEFETIVLKALVKACG 

S SEASAYLDELRL AVA WNRVDI AQSELFR GDI 

QWRSFHLEASLMDALLKDRPEFVRLLISHGLS 

LGHFLIPMRLAQLYSAAPSNSLERNLLDQASH 

SAGTKAPALKGGAAELRPPDVGHVLRMLLG 

KMCAPRYPSGGA WDPHPGQGFGESMYLL SD 

KATSPLSLDAGLGQAPWSDLLLWALLLNRA 

QMAMYFWEMGSNA VSS ALG ACT XLRVMAR 

LI^DAEEAARRKDLAFKFEGMGVDLFGECYR 

SSEVRAARLLLRRCPLWGDATCLQLAMQAD 

ARAFFAQDGVQSLPTQKWWGDMARR 


259 


1609 


A 


2721 


1 


403 


VYLGAGPGLFFSNEGAKEGEKANIPKLMLPR 
GGFSQREMVTGERSPSPEEEEEEEEEGFGERA 
SCRRGLFRVRLTRVGLAAPSKASRGQEGDAA 
rlvbrV XcJvo rFJf Kr rKV oi^or iWVK !>Oo KjUK^C, a 

GGLRVRLP 


260 


1610 


A 


2728 


1 


477 


LLGGDLRYHLQ^NVHFTEGTVKLYICELALA 
LEYLQRYHIII rRDIKPDNILLDEHGHVHITDFN 
IATVVKGAERASSMAGTKPYMAPEVFQVYM 
DRGPGYSYPVDWWSLGITAYELLRGWRPYEI 
HSVTPIDEILNMFKVERVHYSSTWCKGMVAL 
LRK 


261 


1611 


A 


2730 


3 


547 


LTITDF1L VL YR YYRS PL VQIYEIEQUKIETWR 
EIYLQGCFKPLVSISPNDSUTAVYTUKNRIH 
RLPVLDPVSGNVLHILTHKRLLKFLHrFGSLLP 
RPSFLYRTIQDLGIGTFRDLAVVLETAPILTAL 
D1FVDRRVSALAWNECGTHPQDERLGLGW 
GLGEPGSEERLFPAAITSR 


262 


1612 


A 


2733 


3 


431 


GPEFPGSAKLVFLDLSYNNLTQLGAGAFRSA 
GRLVKLSLANNNLVGVHEDAFETLESLQVLE 
LNDNNLRSLSVAALAALPALRSLRLDGNPWL 
CDCDFAHLFS WIQENA SKLPKGLDEIQCSLPM 
ESRRISLRACRRPASRV 


263 


1613 


A 


2736 


2 


343 


PARISGVDPFVRKATKGGENCSFEDNKNWQF 
LWGLNGNFNFFKFJ'WGGRNNHAKGFRTTW 
ARSSSQNNRTFQNNRNFLRLQRDSQKKGQFA 
RLI SPLVNLPQS PGGLEFQYQ AT 


264 


1614 


A 


2738 


2 


245 


RAMLKCLREGQPPPSYNWTRJ.DGPLPSGVRV 
DGDTLGFPPLTTEHSGIYVRHDTNEFSSRDSH 
DTVDVLDPPEDSGKQVDL 


265 


1615 


A 


2752 


2 


388 


AAGDAPLRSLEQANRTRFPFFSDVKGDHRLV 
LAAVETTVLVLIFAVSLLGNVCALVLVARRR 
RRGATACLVLNLFCADLLFISAIPLVLAVRWT 
EA WL LGPVACHLLFYVMTLS GS VTILTL AA V 
SLER 


266 


1616 


A 


2755 


192 


1 


AFREVGG YW G LLCEHL YAIPSKTSEGNWT AK 

LQGYLPLQDAFH1FQDPLTGDLPWPELILGLP 

V 


267 


1617 


A 


2760 


434 


714 


ASRLEKQNSTPESDYDNTPNDMEPDGMGYM 
HRTSVPGEGLPRARDLAGLGQQKQFTTHTPF 
L YFQTHKGLKDS SIRSEVTCLGISQC WRKGFF 


268 


1618 


A 


2762 


1 


405 


IACTFCGQDEWSPERSTRCFRRRSRFLAWGEP 

AVLLLLLLLSLALGLVLAALGLFVHHRDSPL 

VQASGGPLACFGLVCLGLVCLSVLLFPGQPSP 

ARCLAQQPLSHLPLTGCLSTLFLQAAEIFVESE 

LPLSWAJE 


269 


1619 


A 


2772 


3 


243 


TRPAEKIQYLVLFFVMSHPSQAYDKLSLSDHL 
LIAVLNLLRREVSEHGRHLQQYFKLFVMYAN 
LSKNLSFSEFCFDVSY 


270 


1620 


A 


2789 


1 


486 


ELQSQQACTHTKETEQLRSQLQTLKQQHQQA 
VEQIAKAEETHSSL SQELQ ARLQTVTREKEEL 
LQLS IERGK VLQNKQAEICQLEEKLEIANEDR 
KHALERFEQEA VA VD SNLRVRELQRKVDGIQ 
KAYDELRJLQSEAFKKHSLDLLSKERELNGKL 
RHLSP 


271 


1621 


A 


2795 


1 


568 


KEKRVTVQUT-ESIQKNQEDKLKMVPRKQRE 
FSGSDRGKLPGSEEKNQGPSM1GRKEERLITE 
RKHEHLKNKS APKV VKQKVID AHLDS QTQN 
FQQTQIQTAESKAEHKKLPQPYNSLQEEKCLE 
VKG IQEKQVF SNTKDSKQE1TQNKSFFS S VKE 
SQRDDGKGALTsTIVEFLRKREELHQILSTVKQP 


272 


1622 


A 


2797 


8 


523 


KCMQGKYAGAMESEPCVCTEADFDCDYGYE 

RHSNGQCLPAFWFKPSSLSKDCSLGQSYLNST 

GYRKVVSNNCTDGVREQYTAKPQKCPGKAP 

RGLRIVTADGKLTAEQGHNVTLMVQLEEGD 

VQRT1JQVDFGDGIAVSYVNLSSMEDGIXHV 

YQNXGIXRXTVQVDNSLGS 


273 


1623 


A 


2801 


72 


395 


HPSRSNVGPRQLTVWNTSNLSHDNRRKYTFS 
DEEGQNQLGIRIHQDIPLPPRRRELPALRTTNG 
KADSLNVSRNSVMQELSELEKQIQVTRQELQL 
A VSRKTEI ,EE YH 


274 


1624 


A 


2805 


168 


320 


rLWLYFETGTWVYPVFAKLSLLGLAALFSLRE 
IFIARNGWGETLTHCKRV 
275 


1625 


A 


2812 


208 


321 


G SL ATCQLSEPLL WFILR VLDTSD ALKAFHD 
MGKIIFQ 


276 


1626 


A 


2813 


41 


266 


AGRSLHGAGDRAWVGISPTDWSPKWELCK 
KYQQQTW AIDLAGDETIPG S SLLPGH VQ A Y 
QVGPVRRNGEAGPG 


277 


1627 


A 


2817 


3 


410 


VLQERLDNFQRKCIQLASSTEGKVDKLLMRN 
LFISYLHTPKHKQHEVLQAMGSILGITGEEME 
PLFQEEHGTATRWMTGWLEGGSKSVPKTPL 
GLNQQP ALNG SFSELFVKFL KTESL SSTLPTX 
LPPHNSPGKJK 


278 


1628 


A 


2821 


238 


457 


GLSGPSCSCPHSPLPTIISRAQLETALKWRNYE 
VKLRIXLHLEELQMEHDIRHYDLESVPMTWD 
PVDQNPRLV 


279 


1629 


A 


2822 


342 


1 


PLIPANLPAHSNPLQPLPSLPHPFLPATHKFPT 
TPPTFSSVPPPLPSLSSILHHSPLHSELNPHLQS 
CRLPSRPSVSRELPPQSGPASSVPLAPTPLPDS 
VPSQRHPTXPPPAS 


280 


1630 


A 


2825 


307 


77 


PSMVWSYHWGVKQKRLALCVFSFEEGGRRK 

CGQYWPLEKDSRIRFGFLTVTNLTGAVGEPG 

VAFQCDGQRRREPTC 


281 


1631 


A 


2827 


81 


381 


KMGTAVWVPKEKEKRDKASQEGGDVLGAR 
QDCTPSLKSLVATGNLLDLEETAKAPLSTVSA 
NTTNMDEVPRPQALSGSSVVWVSGCVASRS 
VILSLTSG 


282 


1632 


A 


2830 


471 


160 


KXPXDKYELEPSPLTQYILERKSPHTCWQVFV 
TSSGKYNELGYPFGYLKASTTLTCVNLFVMP 
YNYPVLLPLLDDLFKVHKLKPNLKWRQAFDS 
YLKTLPPYYL 


283 


1633 


A 


2835 


462 


148 


VSPALSLTPTIFSYSPSPGLSPFTSSSCFSFNPEE 
MKHYLHSQACSVFNYHLSPRTFPRYPGLMVP 
PLQCQMHPEESTQFSIKLQPPPVGRKNRERVE 
SSEESAP 


284 


1634 


A 


2836 


2 


384 


KTLPRTLLDILADGTILKVGVGCSEDASKLLQ 
DYGLVVRGCLDLRYLAMRQRNNLLCNGLSL 
KSLAETVLNFPLDKSLLLRCSNWDAETLTED 
QVTYAARDAQISVALFLHLLGYPFSRNSPGEK 
KR 


285 


1635 


A 


2843 


20 


271 


PIRPYYSYSGLDRDCSWLPLAKAWLPDVMIL 
VCDRVSEDGINRQQAQEWCIKHGFELVELSP 
EELPEEDGKCLCVRRKYGTYI 


286 


1636 


A 


2845 


197 


278 


TAEDVLTVAYEHGVNLFDrAEVYAAGK 


287 


1637 


A 


2851 


2 


427 


FVAEVRREWAKYMEVHEKASFTNSELHRAM 
NLHVGNLRLLSGPLDQVRAALPTPALSPKDIC 
AVLQNLKR1LAKVQEMRDQRVSLEQQLRELI 
QKDDITGSLVTTDHSQMKKLFEEQLKKYDQL 
KVYLEQNLAAQDRVLCALT 


288 


1638 


A 


2859 


2 


469 


FVNLGILTCIECSGIHREMGAHISRIQSLELDK 

LGTSELLPAKNVGNNSFNDIMEANLPSPSPKP 

TPSSDMTVRKEYITAKYVDHRFSRKTCSTSSA 

KLNELLEAJKSRDLLALIQVYAEGVELMEPLL 

EPGQELAETALHLAVRTADQTSLHLVE 


289 


1639 


A 


2861 


2 


454 


FVASGGPATARMSDSQFFCVAEERSGHCAW 
DGNFLYVWGGYVSIEDNEVYLPNDEIWTYDI 
D SGLWRMHLMEGELPASM SGSCG ACINGKL 
YrFGGYDDKGYSNRLYFVNLRTRDETYIWEK 
rTDFEG QPPTPRDKL SC WVYKDRLI YFG 


290 


1640 


A 


2868 


1 


378 


FRQGQLYKVFLHGSQGQVYHSQQVGPPGSAI 
SPDLLLDSSGSHLYVLTAHQVDRIPVAACPQF 
PDCASCLQAQDPLCXjWCVLQGRCTRKGQCG 
RAGQLNQWLWSYEEDSHCLHIQSLLPGHHPR
QE 


291 


1641 


A 


2870 


1 


385 


FRYMPNNRQQLLRKRHIGNDIVTIVFQEPGAL 

PFTPKSIRSHFQHVFVIVKVHNPCTENVCYSV 

GVSRSKDVFPFGPPIPKGVTFPKSAVFKDFLL 

AKVINAENAAHKSEKFRAMATRTRQEYLICD 

LA 


292 


1642 


A 


2877 


3 


188 


RPTRPPPATTQSPESTMDTSLKKEKSAILDLYI 
PPPPAVPYSPRYVAVHCHGMLVSCWCHL 


293 


1643 


A 


2878 


1 


427 


REKEEEVEEEEDKVVKETEKEAEQEKEEDSL 
GAGTHPDAAIPSGERTCGSEGSRSVLDLVNYF 
LSPEKLTAENR Y YCES C ASLQD AEK WELSQ 
GPCYLILTXLRFSFDLRTMRRRKILDDVSIPLL 
LRLPLAGGRGQAYDL 


294 


1644 


A 


2879 


109 


245 


QLCCFO^QTTLrVYILSFIGMVIFTFTLDLRYI 
irvTVTGGVLG 


295 


1645 


A 


2880 


3 


320 


LASSQHGILNNLSLLFSICKTCIRTMDHHCPRA 
NNCVGEQNHRFFCALHCKSKHFCIEFTLNTNF 
FNCFLPGAEKSTIDAPFSLQPFLQDSKYNTALS 
LSESISQ 


296 


1646 


A 


2892 


209 


363 


SQYSHSLDYHLLQVTKNPFTLGDSSNPGQTE 
RLQEFSQKMDQVRGHWPVST 


297 


1647 


A 


2893 


8 


424 


SPXTLXLDTFILLGIQDNILVLILATPPFMAGG 

KLYSTMGRFLRDRKNPACREMAWLLANLA 

QGDSLAARAIAVQKGSIGHLLGFLEDSLAAT 

QIQQSQASLLHMHNPPFEPTSVDMMRRACRA 

LLALAKVDDNHSEF 


298 


1648 


A 


2894 


310 


445 


FW1YFPSFFMTGYLPLGFEFAVEITYPESEGTS 
SGLLNASAQVNL 


299 


1649 


A 


2898 


1 


492 


KIKAKNLTNYDLCSIFLGTSTLLVWVGVIRYL 
GYFQAYNVLILTMQASLPKVLRFCACAGMIY 
LGYTFCGWIVLGPYHDKFENLNTVAECLFSL 
VNGDDMFATFAQIQQKSILVWLFSRLYLYSF1 
SLFrYMILSLFLALITDSYDTIKKFQQNGFPETD 
LQEF 


300 


1650 


A 


2901 


1 


445 


PVWWNSLNGASEVTFSVHVKDGGSFPKTDST 
TVTVRFVNKADFPKVRAKEQTFMFPENQPVS 
SLVTTITGSSLRGEPMSYY1ASGNLGNTFQIDQ 
LTGQ VSISQPLDFEKIQKYW W IE ARDGG VPP 
FSSYEKLDrrVLDVNDNAPIF 


301 


1651 


A 


2902 


162 


433 


THFICLPLGYCFPLLDKDLQLPSGFNCNFDFLE 
EPCGWMYDHAKWLRTTWASSSSPNDRTFPG 
KPA V SEDMKELRP ACSTYFNPRFP YKL 


302 


1652 


A 


2909 


2 


412 


GPQMLCKKIYFIWVTRSQCQFEWLADIMQEV 

EENDHQDLVSVHTYVTQLAEKFDLRTTMLYl 

CERHFQKVLNRSLPTGLRSITHFGRPPFEPFFN 

SLQEVHPQVRXIGVFSCGPPGMTKNVEKACQ 

LVNRQDRAHFM 


303 


1653 


A 


2914 


291 


453 


KLNR WLCFFYS WSFGILL YEM VTLGAPP YPE 
VPPTSILEHLQRRKIMKRPSSCS 


304 


1654 


A 


2926 


179 


354 


PGVPSQALRKAESLKKCLSVMEAKVKAQTAP 
NKDVQREIADLGEVGAA SLPPSSGPGA 


305


1655 


A 


2938 


135 


438 


GMGYUIAKGILHKDLKSKNVFYDNGKVVIT 
DFGLFSISGVLQAGRREDKLRIQNGWLCHLA 
PEIIRQLSPDTEEDKLPFSKHSDVFALGTIWYE 
LHAREWP 


306 


1656 


A 


2944 


2 


329 


VRWNSCVNCSCAFGNGASLSTSLGESSGCLW 
EIGKWI^CSLLSFPSPLAVLirrFCrVTVLGREA 
LTKGAL W A VFLLAG S ALLC AE VTG VI WRQPE 
SKTKLSFKVSSSA 


307 


1657 


A 


2950 


2 


411 


NYLCIAKN SAQS AMGKTRL WQVPP VIENGL 

PDLSTTEGSHAFLPCKARGSPEPNTTWDKDGQ 

PVSGAEGKFTIQPSGELLVKNLEGQDAGTYT 

CTAENAVGRARRRVHLTILVLFVTTTLPGDRS 

LRLGDRLWLR 


308 


1658 


A 


2951 


1 


407 


PTRPPRVRPDNEFDAESQRKRTTSVSKMERM 
DSSLPEEEEDEDKEAJNGSGNAENRERHSESS 
DWMKTVPSYNQTNSSMDFRNYMMRDETLEP 
LPKN WEMA YTDTGM1 YFIDHNTKTTT WLDP 
RLCKKAKAPEDC 


309 


1659 


A 


2954 


2 


179 


QDFLTLTLTEPTGLLYVGAREALFAFSMEALE 
LQGAVRGGAVGGSRACQRARPRGAVLG 


310 


1660 


A 


2959 


1 


419 


QDMMERAIIDTFVGHDWEPGSYVQMFPYPC 
YTRDDFLF VIEHMMPLCMVIS W VY S V AMTIQ 
HIVAEKEHRLKEVMKTMGLNNAVHWVAWFI 
TGFVQLSISVTALTAILKYGQVLMHSirWIIW 
LFLAVYAVATIMFCF 


311 


1661 


A 


2963 


3 


465 


MKPQMPGLGAPNGYGPGRGRAGVPGGPERR 
PWVPHLLPFSSPGYLGVMKAQKPGAGEGMK 
PQKPGLRGTLKPQKSGHGHENGPWPGPCNA 
RVAPMLLPRLPTPGVPSDKEGGWGLKSQPPS 
A VQNGKLPGHQPPNG YGPGAEP GFNGGLEPQ 

ki 


312 


1662 


A 


2967 


3 


405 


WLAQEWSPCTVTCGQGLRYRVVLCIDHRGM 

HTGGCSPKTKPHIKEECIVPTPCYKJPKEKLPV 

EAKLPWFKQAQELEEGAAVSEEPSFIPEAWS 

ACTVTCGVGTQVRIVRCQVLLSFSQSVADLP1 

DECEGPKPA 


313 


1663 


A 


2969 


2 


430 


WADNCRQGYLDALRFLERRGLTKEPVLWT 
LVSKEPPAPADGNWDAGCDQRRKGGLSLNW 
KVPHVQVKDVFNFEQLSPELEAALKKACTRD 
PSRWARFWHSGPGQVXTYLLLPCTLPFEYTYF 
RSRRL WWLPD VP ADL WWMQ 


314 


1664 


A 


2971 


422 


33 


LDXSHNALQRLRPGWLAPLFQLRALHLDHNE 
LDALGRGVFVNASGLRLLDLSSNTLRALGRH 
DLDGLGALEKLLLFNNRLVHLDEHAFHGLRA 
LSHLYLGCNELASFSFDHLHGLSATHLLTLDL 
SSNRM 


315 


1665 


A 


2973 


1 


525 


ITVSTHASGSPFGLEPQSGWLWVRAALDREA 

QELYILKVMAVSGSKAELGQQTGTATVRVSI 

LNQNEHSPRLSEDPTFLAVAENQPPGTSVGRV 

FATDRDSGPNGRLTYSLQQLSEDSKAFRIHPQ 

TGEVTTLQTLDREQQSSYQLLVQVQDGGSPP 

RSTTGTVHVAVLDLNDNT 


316 


1666 


A 


2978 


2 


400 


ELVVELV S AGKSGPERNTYEVQ WTGN VPKA 
GTDANVYLTIYGEEYGDTGERPLKJCSDKSNK 
FEQG QTDTFITYAIDLG ALTKIRIRHDNTGNR 
AGWFLDRIDITDMNNEITYYFPCQRWLAVEE 
DDGQLSRE 


317 


1667 


A 


2981 


3 


440 


VLNCQGRPTRPVRINGDGQEVLYLAESDNVR 

LGCPYVLDPDDYGPNGLDEEWMQVNSNPAH 

HRENVFLSYQDKRINHGSLPHLQHRVRFAAS 

DPSQYDASINLMNLQVSDTATYECRVKKTTM 

ATRKVIVTVQARPAVPMCWTEGQ 


318 


1668 


A 


2995 


119 


414 


LPEKEFPHRKSSSLKVTKCLFTEQPKPIIILRFA 
ENYD ARM .RrDIANTI .REQVQFXFNKTYGKQ 
RRTPGEGHVAA VDREV AGFPVP AJEGI SGETIH 


319 


1669 


A 


2999 


2 


332 


GFFAYTYGRLWVEDLHSGAQQHWSGHSAE1 
STLALSHSAQVLASASGRSSTTAHCQIRVWD 
VSOGLCQHLIFPHSrrVLALAFSPDDRLLVTL 
GDHDGRTLALWGTGHL 


320 


1670 


A 


3000 


693 


322 


IDESTGLIITVNYLDYETKTSYMMNVSATDQA 
PPFNQGFCSVY1TLLNELDEAVQFSNASYEAA 
lLENLALGTEIVRVQAYSIDNLNQlTYRFDAY 
TSTQAKALFKIDAITVRGWGQGAPFFPI 


321 


1671 


A 


3001 


6 


383 


RIPRGKACXTVLGRSTGELEGF AS SRLPPQPC 
GWGQSSDLLSRIDLDELMKKDEPPLDFPDTLE 
GFEYAFNEKGQLRHIKTGEPFVFNYREHLHR 
WNQKRYEALGEIITKYVYELLEKDCNSKKVS 


322 


1672 


A 


3007 


192 


447 


ERVUKSLFPGRGDSQCACCPSSPVWVFLETGF 

LFPWLFLQVEVIKKAYMQGEVEFEDGENGK 

DGAASPRNVGHNIYILAHQLARH 


323 


1673 


A 


3019 


18 


245 


K£LLFYHLrVNNINFFNTRYAKIHIPIlASVSEH 
QPTTWVSFFFDLHILVCTFPAGLWFCIKNIND 
ERVFGKRGF 


324 


1674 


A 


3020 


523 


797 


LCYFSARYHQRKIFGILYIFTLSAJNRKEPNLFI 
YLFIFFEMESHSVTHAGVQRHNLNSLQPLPPG 
FKRFSCLCFLSSWNYRGAPPGPANF 


325 


1675 


A 


3022 


2 


156 


NDFLPLYFGWVLTKKSSETLRKAGQVFLEEL 
GNHKAFKKELROCRWQVGAL 


326 


1676 


A 


3023 


38 


172 


KMVRGSKKLISFFPGGPYGILAGRDPSKGLAT 
FCLNKEALKDEFE 


327 


1677 


A 


3027 


1 


385 


LTLEFLLLPAASELAHGKRLACCIVDHKLPEC 
GFYGLYDKILLFKHDPTSANLLQLVRSSGDIQ 
EGDLVEVVLSASATFEDFQDIPHALTVHSYRA 
PAFCDHCGEMX,FGLVRQGLKCDGCGLNYIIK 
RC 


328 


1678 


A 


3030 


13 


569 


rTRPTISCQRPGPGLAAGMLPYTVNFKVSART 
LTGALNAHNKAAVDWGWQGLIAYGCHSLV 
WID S ITAQTLQ VLEKHKAD VVKVK WAREN 
YHHNIGSPYCLRLASADVNGKIIVWDVAAGV 
AQCEI QEHAKPIQD VQWLWNQDASRDLLL AI 
HPPNYIVLWNADTGTKLWKKSYADNILSFSF 
D 


329 


1679 


A 


3038 


90 


744 


SVNLPPSLWPWEEAMDSTKSEPLKGSPEAED 

GN1EYKKLVNPSQYRFEHLVTQMKWRLQEG 

RGEAVYQIGVEDNGLLVGLAEEEMRASLKTL 

HRMAEKVGADITVLREREVDYDSDMPRKITE 

VLVRKVPDNQQFLDLRVAVLGNVDSGKSTL 

LGVLTQGELDNGRGRARLNLFRHLHEIQSGR 

TS SISFEILGFNSKGE VHGINGTQ WGQTXRMG 

W 


330 


1680 


A 


3040 


3 


397 


LCSTLLLLTIPS WVLSQITLKES GPTLMKPTET 

LTLTCTFSGFSLNTSGVGVAWIRQPPGKALE 

WLALIYWDDDKRYSPSLNDRLTIAKDTSRNQ 

WLTMTNMGPVDTATYYCAQFARGARGSN 

WFDPWGQ 


331 


1681 


A 


3043 


3 


1509 


AGIRHEAPPTTShHUlRRQIDRGVTHLNISGLK 

MPRG1AIDWVAGNVYWTDSGRDVIEVAQMK 

GENRKTLISGMIDEPHAIVVDPLRGTMYWSD

WGNHPKJETAAMDGTLRETLVQDN1QWPTG 

LAVDYHNERLYWADAKLSVIGSIRLNGTDPI 

VAADSKRCiLSHPFSIDVFEDYTYGVTYTNNRV 

FK1IIKFGHSPLVNLTGGLSHASDVVLYHQHK 

QPEVTNPCDRKKCEWLCLLSPSGPVCTCPNG 

KRLDNGTC VPVPS PTPPPD APRPGTCNL QCFN 

GGSCFLNARRQPKCRCQPRYTGDKCELDQC 

WEHCRNGGTCAASPSGMPTCRCPTGFTGPKC 

TQQVCAGYCANNSTCTVNQGNQPQCRCLPG 

FLGDRCQYRQCSGYCENFGTCQMAADGSRQ 

CRCTAYFEGSRCEVNKCSRCLEGACWNKQS 

GDVTCNCTDGRVAPSCLTCVGHCSNGGSCT 

MNSKMMPECQCPPHMTGPRCEEHVFSQQQP 

GHIASILIP 


332 


1682 


A 


3045 


3 


952 


TTTISNFHTQVNRTYCCGTYRAGPMRQISLVG 

AVDEEVGDYFPEFLDMLEESPFLKMTLPWGT 

LSSUaQCRSQSDDGPIMWVRPGEQMIPTAD 

MPKSPFKRRRSMNEIKNLQYLPRTSEPREVLF 

EDRTRAHADHVGQGFDWQSTAAVGVLKAV 

QFGEWSDQPR1TKDVICFHAEDFTDVVQRLQ 

LDLHEPPVSQCVQWVDEAKLNQMRREGIRY 

ARIQLCDNDIYFIPRNVIHQFKTVSAVCSLAW 

HIRLKQYHPWEATQNTESNSNMDCGLTGKR 

ELEVDSQCVRIKTESEEACTEIQLLTTASSSFP 

PASE 


333 


1683 


A 


3046 


497 


167 


SACSTGPELPGRATRSLTRPANQKGCDGDRL 
YYDGCAMIAMNGSVFAQGSQFSLDDVEVLT 
ATLDLED VRS YRA E ISSRNLA V S APVDTC VG 
CSSKTWKVAPFVRAWWRP 


334 


1684 


A 


3053 


37 


276 


VITDLEEQLNQLTEDNAELNNQNFYLSKQLD 
EASGANDEP/QLRSEVDHLRREITEREMQLTS 
QKQVRRVNKWRSLEDF 


335 


1685 


A 


3054 


2 


846 


WDAWGDWSDCSRTCGGGASYSLRRCLTGR 

NCEGQNIRYKTC SNHDCPPD AEDFRAQQC S A 

YNDVQYQGHYYEWLPRYNDPAAPCALKCH 

AQGQNLWELAPKVLDGTRCNTDSLDMCISG 

ICQAVGCDRQLGSNAKEDNCGVCAGDGSTC 

RLVRGQSKSHVSPEKREENVIAVPLGSRSVRI 

TVKGPAHLFrESKTLQGSKGEHSFNSPGVFW 

ENTTVEFQRGSERQTFKrPGPLMADFIFKTRY 

TAAKDSVVQFFFYQPISHQWRQTDFFPCTVT 

CGGG 


336 


1686 


A 


3058 


54 


347 


VVGKQEAGAHSDSCCLLHTPPRLTPAHSRKA 
LRNSRIVSQKDDVHVCIMCLRAIMNYQVSRG 
AWDWRLGSPACPHWGLHKLPRLWDPLSLYP 
VLCWGT 


337 


1687 


A 


3059 


2 


709 


ILTSLVELTRFETLTPRFSATVPPCWVEVQQE 

QQQRRHPQHLHQQHHGDAAQHTRTWKLQT 

DSNSWDEHVFELVLPKACMVGHVDFKFVLN 

SNTTNIPQIQVTLLKNKAPGLGKVNGLRLCPF 

LEDHKEDILCGPVWLASGLDLSGHAGMLTLT 

SPKLVKGMAGGKYRSFL1HVKAVNERGTEEI 

CNGGMRPWRLPSLKHQSNKGYSLASLXAK 

VAAGKEKSSNVKNENTSGTRK 


338 


1688 


A 


3060 


85 


384 


KAFYNYHVLELLQMLVTGGVSSQLEQHLDK 
DKVYGVADSCTSLLSGRNRCKLGLLSLHETIL 

pni/>mTi\ri 'L'AAt cppot t~\i tt it /*>i//^pt vr*TTr\t? 

SDVNPRN IruyLrCGbl-ljLrGlLCVuLYKlIUb 
EELNP 


339 


1689


A 


3063 


236 


362 


CFLCLSGDFMVMTIFFNVSRRFGYVAFQNYV 
PSSVTTMLSWV 


340 


1690 


A 


3065 


3 


1249 


DL WQFTPLHE AASKNRVEVC SLLLS YG ADPT 

LLNCHNKSAIDLAPTPQLKERLAYEFKGHSLL 

QAAREADVTRIKXHLSLEMVNFKHPQTHETA 

LHCAAASPYPKRKQICELLLRKGANINEKTKE 

FLTPLHVASEKAHNDWEVWKHEAKVNAL 

DNLGQTSLHRAAYCGHLQTCRLLLSYGCDPN 

I1SLQGFTALQMGNENVQQLLQEGISLGNSEA 

DRQLLEAAKAGDVETVKKLCTVQSVNCRDIE 

GRQSTPLHFAAGYNRVSWEYLLQHGADVH 

AKDKGGL VPLHN AC S YGHYE V AELL VKHGA 

WNVADLWKFTPLHEAAAKGKYEICKLLLQ 

HGADPTKXNRDGNTPLDLVKDGDTDIQDLLR 

GDAALLDAAKKGCLARVKKLSSPDNVNCRD 

TQGRHSTPLHLAGK 


341 


1691 


A 


3070 


1 


547 


GVLIPSFQNQLFADILAGIESVTSEHNYQTLIA 

NYNYDRDSEEESVTNLLSYNIDGI1LSEKYHTI 

RTVKFLRSATIPWELMDVQGERLDMEVGFD 

NRQAAFDMVCTMLEKRVRHKILYLGSKDDT 

RDEQRYQjGYCDAMMLHNLSPLRMNPRAISSI 

HLRMQLMRDALSANPDLDGVFCTN 


342 


1692 


A 


3073 


463 


3 


RINRCRKPSDADILVPGDTISLIGTTSLRJDYNE 

IDDNRVTAEEVT>[LLREGEKLAPVMAKTRILR 

AYSGVRPLVASDDDPSGRNVSRGIVLLDHAE 

RDGLDGFITITGGKLMTYRLMAEWATDAVC 

RKLGNTRPCTTADLALPGSQEPAKVP 


343 


1693 


A 


3075 


250 


1 


LLIYLAIFAPVAMSALAGVKSVQQVRIRAAQS 
LGASRAQVLWFVILPGALPEILTGLRIGLGVG 
WSTLVAAELIAATRGLGFM 


344 


1694 


A 


3076 


2 


138 


LYFDAYLQSLQVAAISTFCCLLIGYPLAWAV 
AHSKPSTRNILLLL 


345 


1695 


A 


3078 


469 


3 


LKIRGQRIELGEIDRVMQALPDVEQAVTHAC 

VINQAAATGGDARQLVGYLVSQSGLPLDTSA 

LQAQLRETLPPHMVPWLLQLPQLPLIANGKL 

DRKALPLPELKAQAPGRAPKAGSETIIAAAFS 

SLLGCDVQDADADFFALGGHSLLAMKLAT 


346 


1696 


A 


3082 


404 


2 


QNITSKDLDVRLDPQTVPIELEQLVLSFNHMI 
ERJEDVFTRQSNFSADIAHEIRTPITNLITQTEI 
ALSQSRSQICELEDVLYSNLEELTRMAKMVSD 
MLFLAQADNNQL1PEKKMLNLAHE VGKVFD 
QFEALPE 


347 


1697 


A 


3084 


3 


340 


NELTFKEAEISKLYTKVHPAYRTLLEKRQALE 
DEKAKLNGRVTAMPKTQQEIVRLTRDVESGQ 
QVYMQLLNKEQELKITEASTVGDVRJVDPAIT 
QPGVLKPKKGLI3LGAI 


348 


1698 


A 


3086 


723 


10 


TQAMVWQQKACAEDDPQLSGRHWLHAATL 

YNIAAYPHLKGDDLAEQAQALSNRAYEEAA 

QRLPGTMRQMEFTVPGGAPITGFLHMPKGDG 

PFPTVLMCGGLDAMQTDYYSLYERYFAPRGI 

AMLTIDMPS VGFSSK WKJLTQDS S LLHQHVLKL 

ALPNVPWVDHTRVAAFGFRFGANVAVRLAY 

LESPRLKAVACLGPVVHTLLSGLKCQQQVPE 

MYLDVLASRLGMHDASTKSSTRENH 


349 


1699 


A 


3087 


2 


249 


RIRSSDPEITLAGTPLHAAYLIGMTIJCAGFSV 
GFG VAMSQALGPFSLRAGVAS STLGIAQVCG 
SSL Wl WL AA WGIG A WNM 


350 


1700 


A 


3099 


3 


424 


EAPE ATPQPSQPGPS SPISLS AEEENAEGEVSR 
ANTPDSDITEKTEDSSVPETPDNERKASISYFK 
NQRGIQYIDLSSDSEDWSFNCSNTVQEKTFN 
KDTVirVSEPSEDEESQGLPTMARRNDDISELE 
DLSGMEDLK 


351 


1701 


A 


3108 


2 


404 


1KKNHIIGYQLLHRRALFEKRTRLSDYALIFG 
MFGIVVMVIETEL S WGA YYKAPLYSL ALKCL 
ISLFTIILLGLTIVYHAREIQLFMANYGADDWR 
SALTYEPIFLn .1 .F, ALRGVTHATPCRVSLSLWD 
GLDLP 


352


1702 


A 


3110 


341 


2 


AQLAEVCPPQTLLTTNTSSISITAIAAE1KNPER 
VAGLHFFNPAPVMKLVEWSGLATAAEWE 
QLCELTLSWGKQPVRCHSTPGFIVNRVARPY 
YSE A WRALEE QV AAPE VI 


353 


1703 


A 


3111 


3 


188 


HFSLFRIAFAVFLTYMTVGLPLPVIPLFVHHEL 
GYGNTMVGIAVGIQFLATVLTRGYAGRLA 


354 


1704 


A 


3116 


367 


225 


WQLFI ILNGTFLNIGETDTESCVNGWVYDRSS 
FPFSNMTE VRGL VFLS 


355 


1705 


A 


3117 


101 


53 


VINLVYLISSPRPELKFVDKESEWMKFPDGF 

EKFSPPJLQLDEVDFYYDPKHVIFSRLSVSADL 

ESRJCVVGENGAGKSTMLKLLLGDL\APVRGI 

RHAHRNLK1GYFSQHHVGAAGT*TFSACGNL 

LGTQVFLGRPEEEYXRHQLGFGMGISGELGHA 

SSLPACLGGqKEAEVAFCSDGLLPCPNFLML\ 

DEPTN\HLGHGRAIEALGPCLQTISGVGVILVS 

HE*SALSRLVCRE\LWVC*GRSTSPF 


356 


1706 


A 


3121 


137 


466 


RGGRDWGEHNQRLEEHQARAWQGAMDAG 
AASREHARWQGTGLAPGTRVAVAPTCVQGL 
PQERSVCRPFFSSRWREGPVWALGAGAHGKP 
RWSGGVRCVVRGGRWFTPAPH 


357 


1707 


A 


3124 


1249 


229 


MLEAPGPSDGCELSNPSASRVSCAGQMLEVQ 

PGLYFGGAAAVAEPDHLREAGITAVLTVDSE 

EPSFKAGPG VEDL WRLFVPALDKPETDLLSI I 

LDRCVAFIGQARAEGRAVLVHCHAGVSRSV 

Al IT AFLMKTD QL PFEKA YEKLQ LLKPEAKMN 

EGFEWQLKLYQAMGYEVDTSSAIYKQYRLQ 

KVTEKYPELQNLPQELFAVDFTTVSQGLKDE 

VLYKCRKCRRSLFRSSSILDHREGSGPIAFAH 

KRMTPSSMLTTGRQAQCTSYFIEPVQWMESA 

LLGVMDGQLLCPKCSAKLGSFNWYGEQCSC 

GRWITPAFQIHKNRVDEMKILPVLGSQTGKI 


358 


1708 


A 


3127 


816 


139 


EVETLGPRTPGP/EAQSPTPGSCPGWQEPSPGP 

TPPP*LSGPGPQGAPVLGKLLPDPEETPAGKTP 

LGKHFWWGL\PVTSANFSPGAAA*FGGALSPP 

GGDL/GHMLLQGPPSPFRLQQQ * QTPPGSH SP 

PTANRETNPGPAAAADTRSCWGHKRSWRGW 

RGLAPWRLGFGSPGIP*PAPAGIP/GRPTWEGG 

KGAGGKPSETLTRSPPVWRGKRGSANGFLSW 

VQILQ 


359 


1709 


A 


3132 


3 


191 


HEHLLLLLLCVFLVKSQGVNDNEEGFFSARG 
HRPLDKKREDAPNLRPALADVrrV CDYRAQIA 
*AASTPKRAASIAHNAVSCR*AQIA 


360 


1710 


A 


3134 


1


286 


REPPRPALLFF* DRVSLCCPG WNA VVQSQLT 
AAPTS Q VQ/SDSPTFPS S WDYRH VPEYPANFL 
« RQGFPMLPRLVSNS WAQTVHPPRPPKVLDL 

QA 


361 


1711 


A 


3135 


56 


1449 


PVPAPRVSPSARGAPGRPRLPGVRGPRHS/WA 

AD* RGSRM/PPRAPAPSPTGP/APGGKKVRGR 

VPEDPDAYEPRCSAL*V*PTHVTSPQFCDP*N 

GQIRSYFrVLLRGL>rETMLVK/PLCRREP/PEA 

GPORQSTPAVTRDHRQHEDPROAGRQWDAD 

PRPSAP/PAEVATGSRPGRHMWMRLCLAAQQ 

APGLPHRTSIRPGWRRLTF.PEAWARRHRRPW 

GQRGAVRPPPOGAAPPPSHQGRRTNTDPSAT 

PRLTVMSRCLAPDLKAPASGPRGWRRGMPQ 

SS/GALLWTPPPTPRGSHSPRPREAPLRAIHPA 

GPSKVSRAGASGRLPEVIYGWVTLFTPPEAGT 

F/LIPSPT*MSPALVIQPPVPPTQMGLRISGLPR 

QG * PSG AP W* LPGL AQL AFQC HLPHDE VGPP 
RNQSPLGNDTLSSGLPMGPRRQVWPLARVG
GHS SPREPQ VLKKPL WGQTDIAG VGSASLYP 
DNL 


362 


1712 


A 


3136 


1270 


274 


RVGMVLGTREVGDSTPPPSPPLYPFTGNEFVQ 

HNT WQLSRVYPSDLRTD S SNYNPQEL WNAG 

CQM/V*GGSRDWEEGVEEQQVGNKFSSDGR 

VGECSRKLLG*EMLSVDITSRYRAPSTYLLNS 

LKEGLEGLHGESCSSFLLGPSVAMNMQTAGL 

EMDICDGHFRQNGGCGYVLKPDFLRDIQSSF 

HPEKPISPFKAQTLLNQVISVQQLPKVDKTXE 

GSI VDPL VK VQIFG VRLDTARQETNYVENNG 

FNPYWGQTLCFRVLGPDFPMLRFGKMDYDW 

KSRNDLLGKTPCPGTCMQQGYRHIHLLSKDG 

ISLRPASIFVYICIQEGLEGDES 


363 


1713 


C 


3139 


60 


248 


MFAGSYGKSMFSFSKJKVLNCLPKWRYHFVIA 
PAMNESPLAPHLHQHLVFSVFQVLTILIGV** 


364 


1714 


A 


3140 


57 


418 


SAFKTLQLPAFSLYFDLGSLKLL1LRIHTSIVK 
NHKVESPRTMSPG* DPQ SFLQIPQPRPPQLRV 
GLTSGLIQHFHSPSSCQFPLLRGPPFPRQPPLGI 
SGASLCPVLSPPR*PLQPSSL 


365 


1715 


A 


3145 


122 


413 


LLPYPSLFVFLRQCHFVTvRLECNGWSAHCN 
LHLPGSSDSPASAS*VAGTTGVCHHTRLIF\VF 
LV*TGFHYVAQAGLELLTA*S\PPQLPKWGL 
QA 


366 


1716 


A 


3150 


247 


2 


VGEKLHDIRFGNDFDMTPK AQATKEK IDKLN 
FIK1KKLCIEGYY/NREPQNGRKIFANYVS\DK 
GLMATIYEELLKJLSNKLIQ 


367 


1717 


A 


3152 


3 


2367 


QKLKQNQPKRAHVEDGGSRSKQGNEQSKKT 

PI EK S D F AAATHPRAF YL SKPDETPN A WM S D 

SGTGLTYWKLEEKDMHHSLPETLEKTFISLSS 

TD VSPNQ VLTLDPTLHMKPKQQI SGIQPHGLP 

NALDDR1SFSPDSVLEPSMSSPSDIDSFSQASN 

VTSQLPGFPKYPSHTKASPVDSWKNQTFQNE 

SRTSSTFPSWTITSNDISVNTVDEENTVMVAS 

ASVSQSQLPGTANSVPEC1SLTSLEDPVTLSKIR 

QNLFCEFCHARHIADLRAYYESErNSLKQKLEA 

KEISGVEDWKJTNQILVDRCGQLDSALHEATS 

RVRTLENKNNLLEIEVNDLRERFSAASSASKI 

LQER1EEMRTSSKEKDNTIIRLKSRLQDLEEAF 

ENAYKLSDDKEAQLKQENKMFQDLLGEYES 

LGKEHRRVKDALNTTENKLLDAYTQISDLKR 

MISKLEAQVKQVEHENMLSLRHNSRIHVRPS 

RANTT ATSDVSRRKWLIPGAEYSIFTGQPLDT 

QDSNVDNQLEETCSLGHRSPLEKDS SP/GS S ST 

SLLIKKQRETSDTPIMRALKELDEGKIFKNWG 

TQTEKEDTSNSLL*/INPRQTETSVNASRSPEK 

CAQQRQKRLNSASQRSSSLPPSNRKSSTPTKR 

EIMLTPVTVAYSPKRSPKJENLSPGFSHLLSKN 

ESSPIREKTYSEKATDNHVNHSSCPEPVPNGV 

KK. VS VRTA WEKNK S VS YEQCKPVS VTPQGN 

DFEYTAKIRTLAETERFFDELTKEKDQIEAAL 

SRMPSPGGRJTLQTRLNQVKCLSLNLL 


368 


1718 


A 


3163 


2 


2350 


EFK.SGGCG AGL VAAGA VL VL YPASRAGERT 

RVPGSPAPSSLPLHSPGACGTEVDMDPQRSPL 

LEVKGNIELKRPLIKAPSQLPLSGSRLKRRPDQ 

MEDGLEPEKKRTRGLGATTKnTSHPRVPSLT 

TVPQTQGQTTAQKVSKKTGPRCSTAIATGLK 

NQKPVPAVPVQKSGTSGVPPMAGGKKPSKRP 

AWDLKGQLCDLNAELKRCRERTQTLDQENQ 
QLQDQLRDAQQQVKALGTERTTLEGHLAKV

QAQAEQGQQELKNLRACVLELEERLSrQEGL 

VQELQKKQVELQEERRGLMSQLEEKERRLQT 

SEAALSSSQAEVASLRQETVAQAALLTEREER 

LHGLEMERRRLHNQLQELKGNIRVFCRVRPV 

LPGEPTPPPGLLLFPSGPGGPSDPPTRLSLSRSD 

ERRGTLSG APAPPTRHDFSFDR VFPPG SGQDE 

VFEEIAML VQS ALDG YPVCIFA YGQTG SGKTF 

TMEGGPGODPQLEGLIPRALRHLFSVAQELSG 

QGWTYSFVASYVEIYNETVRDLLATGTRKGQ 

GGECEIRRAGPGSEELTVTNARYVPVSCEKEV 

DALLHLARQNRAVARTAQNERSSRSHSVFQL 

QISGEHSSRGLQCGAPLSLVDLAGSERLDPGL 

ALGPGERERLRETQAINSSLSTLGLVTMALSN 

KESHVPYRNSKLTYLLQNSLGG SAKMLMFV 

NI SPLEEN V SESLNSLRF ASKVEPS VLFGTAQS 

NRKWKTDPDLC VCVC VC VCVC VCVC V C VP 

MSMYRVRGGRVAGGCFIGWRAPCPRAIK 


369 


1719 


A 


3165 


365 


12 


GYTSQGRWID1ERGPLTANTESLI IENNFNALP 
G YIRKIE* I * IYKKN* INFGGVGLLNrVKI S1L S/K 
lYRFDAIPVKILTRFFrNLDKLILKFVLKTKIAK 
NRIKTFYIMRRKKLGDSS 


370 


1720 


A


3170 


393 


42 


G ASISPS A V IDG VEG LKPMQEQEAQE AGPCLD 
* HMAPEQ W V APR\RLLF RLIFS VLHAL11 AAAA 
QSSAEEDEDPRN*GQSSEDQAPNQNGLIVrv'H 
RVHVPLGAAATVPVHRSHFPR 


371 


1721 


A 


3173 


770 


510 


GNGGCGLSQIPPSHLGAFSRGSLLSRG\DPRGP 

PPHPVIFFVFVVE\QGFTVLARMVS1S*PCDPP 

ALASQSAGITGVSHLARPQNLYF 


372 


1722


A 


3180 


381 


76 


RVLIIIIDNVPAHSSPQKREISQEFQLEIRHLP*S 
PDLAPSGCFLFLNLKNIFKXGTHFSLVDNVKK 
TVSTWLH/SQNAQFYKDRLNGWYHCLQKCL 
QHY*AYVEK 


373 


1723 


A 


3181


410 


14101 


RREVAGPEGKGLLLASAHTMLTPPLLLLLPLL 

S AL VAAAID APKTCSPKQFACRDQITCI S K.GW 

RCDGERDCPDGSDEAPEICPQSKAQRCQPNE 

HNCL G TE LCVPMS RL CNG VQ DCMDG S D EG P 

HCRELQGNC SRLGCQHHC VPTLDGPTC YCN S 

SFQLQADGKTCKDFDECSVYGTCSQLCTNTD 

GSFICGCVEGYLLQPDNRSCKAKNEPVDRPP 

VLLIANSQNILATYLSGAQVSTITPTSTRQTTA 

MDFSYANETVCWVHVGDSAAQTQLKCARM 

PGLKGF\T>EHTINISLSLHHVEQMArDWLTGN 

FYFVDDIDDRIFVCNRNGDTCVTLLDLELYNP 

KG1ALDPAMGKVFFTDYGQIPKVERCDMDG 

QNRTXLVDSKTVFPHGrTLDLVSRLVYWADA 

YLDYTEVVDYEGKGRQTIIQGIL1EI 1LYGLTVF 

ENYLYATNSDNANAQQKTSVIRVNRFNSTEY 

QWTRVDKGGALHIYHQRRQPRVRSHACEN 

DQYGKPGGCSDICLXANSHKARTCRCRSGFS 

LGSDGKSCKKPEHELFLVYGK.GRPGIIRGMD 

MGAKVPDEHMIPIENLMNPRALDFHAETGFI 

YFADTTSYLIGRQKIDGTERETILKDGIHNVE 

GVAVDWMGDNLYWTDDGPKKT1SVARLEK 

AAQTRKTLIEGKMTHPRAIVVDPLNGWMYW 

TDWEEDPKDSRRGRLERAWMDGSHRDIFVT 

SKTVLWPNGLSLDIPAGRLYWVDAFYDRIETI 

LLNGTDRKIVYEGPELNHAFGLCHHGNYLFW 

T£YRSGSVYRLERGVGGAPPTVTLLRSE\RPPI 
FEIR\MYDAQHQQVGSNKCRVNNAGCSSLCL

ATPG SRQCAC AEDQ VLD ADGVTCLANPS Y VP 

PPQCQPGEFACANSRCIQERWKCDGDNDCLD 

NSDEAPALCHQHTCPSDRFKCENNRCIPNRW 

LCDGDNDCGNSEDESNATCSARTCPPNQFSC 

ASGRCIPISWTCDLDDDCGDRSDESASCAYPT 

CFPLTQFTCNNGRCTNrNWRCDNDNDCGDNS 

DEAGCSHSCSSTQFKCNSGRCIPEHWTCDGD 

NDCGDYSDETHANCTNQATRPPGGCHTDEF 

QCRLDGLCIPLRWRCDGDTDCMDSSDEKSCE 

GVTHVCDPSVKFGCKDSARCISKAWVCDGD 

NDCEDNSDEENCESLACRPPSHPCANNTSVC 

LPPDKLCDGND DCGDGSDEGELCDQCSLNN 

GGCSHNCSVAPGEGIVCSCPLGMELGPDNHT 

CQIQSYCAKHLKCSQKCDQNKFSVKCSCYEG 

WVLEPDGESCRSLDPFKFFIIFSNRHEIRRIDLH 

KGDY SVLVPGLRNTIALDFHL SQS ALY WTD V 

VEDKIYRGKLLDNGALTSFEWIQYGLATPEG 

LAVDWIAGNIYWVESNLDQIEVAKLDGTLRT 

TLLAGDffillPRAIALDPRDGILFWTDWDASLP 

RIEAASMSGAGRRTVHRETGSGGWPNGLTV 

D YLEKRILWID ARSD AJYS ARYDGS G HME VL 

RGHEFLSHPFAVTLYGGEVYWTDWRTNTLA 

KANKWGHNVTVVQRTNTQPFDLQVYHPSR 

QPMAPNPCEANGGQGPCSHLCLINYNRTVSC 

ACPHLMKLHKDNTTCYEFKKFLLYARQMEIR 

G VDIJD AP YYN Y II S FTVPD IDN VTVLD Y DARE 

QRVYWSDVRTQAIKRAFINGTGVETVVSADL 

PNAHGl^VDWVSRNLFWTSYDTNKKQINVA 

RLDGSFKNAWQGLEQPHGLWHPLRGKLY 

WTDGDNISMANMDGSKRTLLFSGQKGPVGL 

AIDFPESKLYWISSGNHTINRCNLDGSGLEVID 

AMRSQLGKATALAIMGDKLWWADQVSEKM 

GTCSKADGSGSVVLRNSTTLVMHMKVYDESI 

QLDHKGTNPCSVNNGDCSQLCLPTSETTRSC 

MCTAGYSLRSGQQACEGVGSFLLYS VHEGI R 

GIPLDPNDKSDALVPVSGTSLAVGIDFHAEND 

TrYWVDMGLSTISRAKRDQTWREDVVTNGIG 

RVEGIAVDWIAGNTYWTDQGFDVIEVARLNG 

SFRYWISQGLDKPRAITVHPEKGYLFWTEW 

GQYPRIERSRLDGTERVVLV1WSISWFNGISV 

DYQDGKL YWCD ARTDKIERIDLETGEN RE VV 

LS SNNMDMFS VS VFED FIYW S DRTHANG S IK 

RG SKDNATDS VPLRTGIG VQLKDIKVFNRDR 

QKGTNVC AVAN GGCQQLCL YRGRGQRACA 

CAHGMLAEDGASCREYAGYLLYSERTILKSI 

HLSDERNLNAPVQPFEDPEHMKNVIALAFDY 

RAGTSPGIPNRlFh"SDIHFGNIQQlNDDGSRRIT 

rVENVGSVEGLAYHRGWDTLYWTSYTTSTrr 

RHTVDQTRPGAFERETVITMSGDDHPRAFVL 

DECQhnLMFWTNWhTEQHPSlMRAALSGANVL 

TLrEKDIRTPNGLArDPIRAEKLYFSDATLDKIE 

RCEYDGSHRYVtLKSEPVHPFGLAVYGEHIF 

WTDWVRRAVQRANKHVGSNMKLLRVDIPQ 

QPMGIIAVANDTNSCELSPCRINNGGCQDLCL 

LTHQGHVNC S CRGGRILQDDLTCRA VN SSCR 

AQDEFEC ANGECINFSLTCDG VPHCKDKS DE 

KPS YCNSRRCKKTFRQC SNGRCVSNML WCN 

GADDCGDGSDEIPCNKTACGVGEFRCRDGTC 

IGNSSRCNQFVDCEDASDEMNCSATDCSSYF 
RLGVKGVLFQPCERTSLCYAPSWVCDGAND

CGDYSDERDCPGVKRPRCPLNYFACPSGRCB? 

MSWTCDKEDDCEHGEDETHCNKFCSEAQFE 

CQNHRCISKQWLCDGSDDCGDGSDEAAHCE 

GKTCGPSSFSCPGTHVCVPERWLCDGDKDCA 

DGADESIAAGCLYNSTCDDREFMCQNRQCIP 

KHFVCDHDRDCADGSDESPECEYPTCGPSEF 

RCANGRCLSSRQWECDGENDCHDQ SDEAPK 

NPHCTSPEHKCNASSQFLCSSGRCVAEALLCN 

GQDDCGDSSDERGCHINECLSRKLSGCSQDC 

EDLKIGFKCRCRPGFRLKDDGRTCADVDECS 

TTFPCSQRCINTHGSYKCLCVEGYAPRGGDP 

HSCKAVTDEEPFLIFANRYYLRKLNLDGSNY 

TLLKQGLNNAVALDFDYREQMIYWTDVTTQ 

GSMTRRMI ILNG SNVQVLHRTGLSNPDGLAV 

DWVGGNLYWCDKGRDTIEVSKLNGAYRTVL 

VSSGLREPRALWDVQNGYLYWTDWGDHSL 

IGRIGMDGSSRSVIVDTKITWPNGLTLDYVTE 

RIYWADAREDYIEFASLDGSNRHWLSQDIPH 

IFALTLFEDYVYWTDWETKSINRAHKTTGTN 

KTLL1STLHRPMDLHVFHALRQPDVPNHPCK 

VNNGGCSNLCLLSPGGGHKCACPTNFYLGSD 

GRTCVSNCTASQFVCKNDKCIPFWWKCDTE 

DDCGDHSDEPPDCPEFKCRPGQFQC STG ICTN 

PAFICDGDNDCQDNSDEANCDIHVCLPSQFK 

CTNTNRCIPGIFRCNGQDNCGDGEDERDCPE 

VTCAPNQFQCSITKRCIPRVWVCDRDNDCVD 

G SDEP ANCTQMTCG VDEFRCKDSGRCIPARW 

KCDGEDDCGDGSDEPKEECDERTCEPYQFRC 

KNNRCVPGRWQCDYDNDCGDNSDEESCTPR 

PCSESEFSCANGRC1AGRWK.CDGDHDCADGS 

DEKDCTPRCDMDQFQCKSGHCIPLRWRCDA 

DADCMDG SDEEACGTG VRTCPLDEFQCNNT 

LCKPLAWKCDGEDDCGDNSDENPEECARFV 

CPPNRPFRCKNDRVCL WIGRQCDGTDNCGD 

GTDEEDCEPPTAHTTHCKDKKEFLCRNQRCL 

S SSLRCNMFDDCGDGSDEEDCSIDPKLTSCAT 

NASICGDEARCVRTEKAAYCACRSGFHTVPG 

QPGCQDINECLRFGTCSQLCNNTKGGHLCSC 

ARNFMKTtlNTCKAEG SEYQVLYIADDNEIRS 

LFPGHPHSAYEQAFQGDESVRJDAMDVHVKA 

GRVYWTNWHTGTISYRSLPPAAPPTTSNRHR 

RQIDRGVTHLN1SGLKMPRGIAIDWVAGNVY 

WTDSGRD VIE VAQMKGENRKTLI SGMIDEPH 

AIVVDPLRGTMYWSDWGNHPKIETAAMDGT 

LRETLVQDNIQWPTGLAVDYHNERLYWADA 

KLSVIGSIRXNGTDPIVAADSKRGLSHPFSrDV 

FEDYTYGVTYINNRVFKIHKFGHSPLVNLTGG 

LSHASDVVLYHQHKQPEVTNPCDRKKCEWL 

CLLSPSGPVCTCPNGKRLDNGTCVPVPSPTPP 

PDAPRPGTCNLQCFNGGSCFLNARRQPKCRC 

QPRYTGDKCELDQCWEHCRNGGTCAASPSG 

MPTCRCPTGFTGPKCTQQVCAGYCANNSTCT 

VNQGNQPQCRCLPGFl ,GDRCQYRQCSGYCE 

NFGTCQMAADGSRQCRCTAYFEGSRCEVNK 

CSRCLEGACWNKQSGDVTCNCTDGRVAPS 

CLTCVGHCSNGGSCTMNSKMMPECQCPPHM 

TGPRCEEHVFSQQQPGHIASILIPLLLLLLXVL 

VAOVVFWYKRRVQGAKGFQHQRMTNGAM 

NVEIGNPTYKMYEGGEPDDVGGLLDADFAL
DPDKPTN FTNP VYA TL YMGGHGSRHiJLASTD
EKRELLGRGPEDErG DPLA 


374 


1724 


A 


3187 


191 


1815 


CLELASAGKIPEESKALSLLAPAPTMTSLMPG 

AGLLPIPTPNPLTTLGVSLSSLOAIPAAALDPNI 

ATLGEIPQPPLMGNVDPSKIDEIRRTVYVGNL 

NSQTTTADQLLEFFKQVGEVKFVRMAGDET 

QPTRFAFVEFADQNSVPRALAFNGVMFGDRP 

LKINHSNNAIVKPPEMTPQAAAKELEEVMKR 

VREAQSFISAAIEPG WLHSTSLCKDFLGCF* RR 

RMYRE* APCTICGTFHLCLIINWDL • LF* A YTA 

K+FFPPR VWKEQ* KKRRVRSRSHTRSK.SRS S SK 

SHSRRKRSQSKHRSRSHNRS RSRQ KDRRRSK 

SPHKKRSKSRERRKSRSRSHSRDKRKDTREKI 

KEKERVKEKDREKEREREKEREKEKERGKN 

KDRDKEREKDREKDKEKDREREREKEHEKD 

RDKEKEKEQDKEKEREKDRSKEIDEKRKKDK 

KSRTPPRSYNASRRSRSSSRERRRRRSRSSSRS 

PRTSKTIKRKS SRSPSPRS RNKKDKKREKERD 

HISERRERERSTSMRKSSNDRDGKEKLEKNST 

S 


375 


1725 


A 


3192 


415 


101 


AHSSHQTRAILQEFQWDIIRHPPLASPNLALSG 
FVFPNLKKSLRGTHFSSVKK\TTLTWLNSQDP 
WF/FFYP*SPDLQIPSSFRNGLNDWYHHSQKC 
PDLDGAYVKK 


376 


1726 


A 


3199 


931 


418 


GV*WCDLGSPQPPPPGFKQFCLGRSSSWDYR 
HVPPHPANFVFLLETGFLHAGQAGUGDPPAS 
ASQSAGITGVSHTWPKNHLIFYACLVIRSKRI 
K 


377 


1727 


A 


3201 


274 


1285


KTGYTSRGSPLSPQSSIDSELSTSELEDDSISM 

GYKLQDLTDVQIMARLQEESLRQDYASTSAS 

VSRHSSSVSLSSGKKGTCSDQEYDQYSLEDEE 

EFDHLPPPQPRLPRCSPFQRGIPHSQTFSSIREC 

RRSPSSQYFPSNNYQQQQYYSPQAQTPDQQP 

NRTNGDK/PPK.KYA* PSPD AKYNCH* * QH\SSP 

VTVRNSQSFDSSLHGAGNGISRJQSCIPSPGQL 

QHRVHSVGHFPVSIRQPLKATAYVSPTVQGSS 

NMPLSNGLQLYSNTGIPTPNKAAASGIMGRS 

ALPRPSLAING SNLPRSKIAQPVRSFLQPPKPL 

SSLSTLRDGNWRDGCY 


378 


1728 


A 


3202 


112 


1789 


VPGVTF.SRPS VLRGDHLFAJLLSSET7 IQEDPIT 

YKGFVHKV\ELDRVKLSFSMSLLSRFVGWG* 

PFKVNFY/TFNRQPLRV\QHRALELTGRWLLW 

PMLFP\ V APRD VPLLPS D VKLKL YDRSLESNP 

EQLQAMRHrVTGTTRPAPYIIFGPPOTOKTVT 

LVEAJKQVVKHLPKAHILACAPSNSGADLLC 

QRLRVHLPSSIYRLLAPSRDIRMVPEDIKPCCN 

WDAKKGEYVFPAKJCKLQEYRVLITTLITAGR 

LVSAQFPIDHFTHIFIDEAGHCMEPESLVAJAG 

LMEVKETGDPGGQLVLAGDPRQLGPVLRSPL 

TrWUrtT rivCT T COT T TA/\JCT W"T^ ^lX>T\f^ vnDH 

1 QRJtIUHj I bLL.fc.ttLL 1 Y NoL YKK.Ur.UVj YDrQ 
FITKLLRNYRSHPTTLDIPNQLYYEGELQACA 
D WDRERFCR WAG\LPRQGFPIIFHG VMGKD 
EREGNSPSFFNPEE.AATVTSYLKLLLAPSSKK 
GKARLSPRS VG VI SP YRKQ VEKIRYCITKLDR 
ELRGLDDIKDLKYTCCSTVTPCLPCAPTCPLP 
ETSSSFHSSPRPRPTPAALNRARALPEPLTPGD 
SNLRVWDGIRKPACLTNTSCHS 


379 


1729 


A 


3206 


432 


130 


PKAAPSVXLWFPPFL*GSFKPTKGHTXCVXIK 
♦LSTREAXDSXPGRQIAXXRQGGKVETTTAL 



XKQSNNKGTRASSYXEPDAXEQWKFPHKKL
QLPGXTHE 


380 


1730 


A 


3207 


187 


507 


GGTGHPHPARPPLSGVGGCQCSHSKPWTAGS 
PEQRDHPAPHKQ1EAGQGLPGPQAWGG* KGP 
AXLLPGPGGGPGPVASLEARAQASSGVTPNG 
GGRTYPYPTFSSGE 


381 


1731 


A 


3225 


1 


840 


GTRPGHLPAPSDGFCV/HL*SIPS WGSF* GESL/ 

EMQUTSLGLQEFDIARNVLELIYAQTLVWIGI 

FFCPLLPFlQMrMLFIMFYSKNISLMMNFQPPS 

KAWRASQMMTFFIFLLFFPSFTGVLCTLAITI 

WRLKPSADCGPFRGLPLHHSIYSWIDTLSTRP 

GYLWVVWrYRNLlGSVHFFFILTLlVXirTYLY 

WQITECRKIMIRLLHEQirNEGKDKMFLIEKLl 

KLQDMEKKANPSSLVLERREVEQQGFLHLGE 

HDGSLDLRSRRSVQEGNPRA 


382 


1732 


A 


3238 


256 


38 


LLMJKVSSTCFSCHLHHHHHHHHRHHQGHNS 
LFFSLKSSSNSSTLPVYLSYNIILVFSKCLVFDF 
LFSNACL 


383 


1733 


A 


3241 


1542 


343 


KGAPSFVRLYQYPNFAGPHAALANKSFFKAD 

KVTMLWNKKATAVLVIASTDVDKTGASYYG 

EQ fLH YIATNGESA V VQLPICNGPI YDV VWN S 

SSTEFCAVYGFMPAKATTFNLKCDPVFDFGTG 

PRN AA YYSPHGHIL VL AGFGNLrLQI* AD/IMK 

VWNVKNYKLISKPVASDSTYFAWCPDGEHIL 

T ATC APRL RVNNG YKI WHYTG SIL HK YD VP S 

N AEL WQV SWQPFLD GIFPAKTTTYQ A VPS E VP 

NEEPKVATAYRPPALRNKPITNSKLHEEEPPQ 

NMKPQSGNDKPLSKTALKKQRKHEAKKAAK 

QEARSDKSPDLAPTPAFQSTPRNTVSQSISGDP 

EIDKKIKNLKKKLKAIEQLKEQAATGKOLEK 

NQLEKIQKETALLQELEDLELGI 


384 


1734 


A 


3242 


3 


678 


IRSPAARSPGLETPTCLLFVIAAIAAVFVDSAIP 
RLTQHRPQDGSFPYTILDPPLYLPGQCAPPQP 
LSQCARRVHGEKLRRPTFGPRHRGAGTAKMS 
ASL VRAT VRA VSKRKL QPTRAALTLTPS A VN 
KIKQLLKDKPEHVG VKVG VRTRGCNGL S YTL 
EYTKTKGDSDEEV1QDGVRVFIEKJKAQLTLL 
GTEMDYVEDKLSSEFVFNNPNIKGTCGCGES 
FNI 


385 


1735 


A 


3243 


3190 


664 


VAMGTPRAQHPPPPQLLFLILLSCPWIQGLPL 

KEEEILPEPGSETPTVASEALAELLHGALLRR 

GPEMGYLPGPPLGPEGGEEH 1 Tri'IITTTTVTT 

TVTSPVLCNNNISEGEGYVESPDLGSPVSRTL 

GLLDCTYSIHVYPGYGIEIQVQTLNLSQEEELL 

VLAGGGSPGLAPRLLANSSMLGEGQVLRSPT 

NRLLLHFQSPRVPRGGGFRJHYQAYLLSCGFP 

PRPAHGDV S VTDLHPGGTATFHCD S GYQLQG 

EETLICLNGTRPSWNGETPSCMASCGGT1HNA 

TLGR1VSPEPGGAVGPNLTCRWVIEAAEGRRL 

HLHFERVSLDEDNDRLMVRSGGSPLSPVIYDS 

DMDD VPERGLISD AQ SL YVELLSETPANPLLL 

SLRFEAFEEDRCF APFLAi IGNVTTTDPE YRPG 

ALATFSCLPGYALEPPGPPNAIECVDPTEPHW 

NDTEPACKAMCGGELSEPAGVVLSPDWPQS 

YSPGQDCVWGVHVQEEKRTLLQVEILNVREG 

DMLTLFDGDGPSARVLAQLRGPQPRRRLLSS 

GPDLTLQFQAPPGPPNPGLGQGFVLHFKEVPR 

NDTCPELPPPEWGWRTASHGDLIRGTVLTYQ 

CEPGYELLGSDILTCQWDLSWSAAPPACQKJ 
MTCADPGEI AN GHRTASDAGFPVGSHVQ YRC

LPGYSLEGAAMLTCYSRDTGTPKWSDRVPKC 

ALKYEPCLNPGVPENGYQTLYKHHYQAGESL 

RFFCYEGFELIGEVnTCVPGHPSQWTSQPPLC 

KVTQTTDPSRQLEGGNLALAILLPLGLVIVLG 

SGVYIYYTKLQGKSLFGFSGSHSYSPITVESDF 

SNPLYEAGDTREYEVSI 


386 


1736 


A 


3250 


5725 


3984 


GTSTVTN1ATKKHFSIILNLLGMLLXKDNQDT 

RKLLMTWALEVAWMKKSETYAPLFCLPSF 

HKFCKGLLADTLVEDVNICLQACSSLHALSSS 

LPDDLLQRCVDVCRVQLVHRGTCIRQAFGKL 

LKSIPLGVFLSNNNHTEIQEISLALRSHMSKAP 

SNTFHPQDFSD/VI SFIL YGNSHRTGKDNWLE 

RLFYSCQRLDrCRDQSTLPRNLLKTDAVLWQW 

AIWEAAQFTVLSKLRTPLGRAQDTFQTIEG1IR 

SLAGHTLNPDQDVSQWTTADNDEGHGNNQL 

RLVLLLQYLENLEKLMYNAYEGCANALTSPP 

KVIRTFLYTNRQTCQDWLTRIRLSIMRVGLLA 

GQPAVTVRHGFDLLTEMKTTSLSQGNELEVSI 

MMVVEALCELHCPEAIQOIAVWSSSIVGKHL 

L WrNS VAQQ AEGRFEKAS VEYQEI ILCAMTG 

VDCCI S SFDKS VLTLAS AGCKS AS LKHCLNGE 

SRKSVLSKPTDSSPEVtNYLGNKACECYISTA 

DWAAVQEWQNAIHDLKKSTSSTSLNLKADF 

NYIKSLSSFESGKFVECTEQLELLPGENINLLA 

GGSKEKJDMKKLLRKM 


387 


1737 


A 


3255 


380 


76 


MDIFLYNCKYQVQTEr*NSIQHIMAVSKKLSRF 
LKYVHNL* AENYKTLMK* INEDLNKQRDVPY 
S* TARLNKMSIPTKTIFRFKATYIKIPAT YF1ET 
NMQ 


388 


1738 


A 


3260 


685 


428 


PQWLGLQVYALPPANFVFFVEMRSTILAQTG 
FELLDSSDLPASASKSAGITCMSHHARTLSLK 
*WPFCLSATQEKFC*PASEGVAW 


389 


1739 


A 


3269 


1 


332 


LDGYHTPIYMLNR1IRLPAAL* IISDQTGHALTI 
LTRLETQMINAD YQNKLTLD YLLTTDREVY E 
PFNLTNYCLI HHNQRLG A YDLG * V*Q/KL AH V 
P VQ V* HGFDPE AMFR 


390 


1740 


A 


3270 


2 


372 


GRCHDQNKGKSVDGPDAQAEACGGESTYQEL 
LVNQNP1GQPLACRRLTRKIYEGIKKAVKPNH 
SPRGVKKVHKFVNKGEKG1MVLAGDTLGIGV 
YCLLPCMC* DRKLTY AHIPSTTDLG AG AG Y 


391 


1741 


A 


3273 


1 


187 


FFQEMLDIMKAISDMMGKCITPVLKEDAPRQ 

HVETFFQVEELTRSQF.GMK1 -GENFLNIFAMPP 
DDSKESFCGK*FFQEMLDIMKAISDMMGKCTY 
PVLKEDAPRQHVETFFQVGINQKSRGHEVRR 
KFPDVCHAPR 


392 


1742 


A 


3281 


901 


521 


FFFGDGVSPCRQAGV*WHDLDSLQNLPPGFK 
RFS YLSLPSSW\D YRHVLPRQANFCIF/M * RRG 
FTMLARMVSIS*PRDLPALASQSAGITGVSHH 
ArrQMDr 1 r ALLL-r ALKGCLrKQK±.GG I LNL1 


393 


1743 


A 


3283 


385 


3 


RNRSWPEFVLLGLSAGPQTQTLLFVLFWIC 
LLTVMGNIiLLVVTNADSCLHTPMYFFLGQL 
SFLDLCHS S VT APKLLENLLSEKKTI S VEG CM 
A* VFFVFATGGTESSLLAVMAYDRYVAIRTR 
G 


394 


1744 


A 


3284 


575 


1054 


CTKCKADCDTCFNKNFCTKCKSGFYLI ILGKC 
LDNCPEGLEANNHTMECVSIVHCEVSEWNP 
W SPCTKKGKTCGFKRGTETR VRE IIQHP S AKG 
NLCPPTNETRKCTVQRKK.CQKGERGKKGRE 
RKRKKPNKGESKEA1PDSKSLESSKEIPEQREN 
KQQQ 


395 


1745 


A 


3286 


1 


340 


RVLYVPSMGFC1LVAHGWQK1STKSVFKKLS 
WICXSMV1LTHSLKTFHRNWDWESEYTLFMS 
ALKVNKNNAKLWNNVGHALENEKNFERAL 
K YFL QATHVQPDDJ G AHMNVGR 


396 


1746 


A 


3293 


1 


172 


GFRAWMTVKTEAAKGTLTYSRMRGMVAn. 
1AFMKQRRMGLNDFIQK1ANNSYACKQ 


397 


1747 


A 


3295 


12 


401 


AEPACGASSCTPPSLRSSSSQSVGPLRPGRPL 

WSEACAFL*AAAPQGPASPCCGLPSGFPRVW 

AQCCPPGGALRFPEGLGSVLSPRRCPQVSRGS 

GLSAVPQEVPSGFLGPGLRACPQEAPSRFLRA 

GLT 


398 


1748 


A 


3300 


1912 


2768 


KQRRWQNIQRKGPKRYIVlAGNSQSHQPMrFS 

MLRKLPKVTCRDVLPEIRAICIEEIGCWMQSY 

STSFLTDSYLKYIGWTLHDKHREVRVKCVKA 

LKGL YGNRDLT ARLELFTG RFKDWMV SMI V 

DREYS VAVEAVRLLILILKNMEGVLMDVDCE 

SVYPIV*ASN*GLASAVGEFLYWKLFYPECEI 

RTMGGREQRQSPGAQRTFFQLLLSFFVESKSH 

SVTQAGVQWQFSAHRDLCLPGSSNSHVSASR 

VAGIAGAHRHTWLIYVFFSWRQGFAVLAGL 

VSNS 


399 


1749 


A 


3301 


536 


2391 


LRSYGCKAPSRISHLHKAFLFLLLPSLLMGYSE 

SPPPITDSWAPFJSLTHHVLSQSQSPLSSNCWI 

C L STHTQ* FT AJL PAD LLT WTQ SN VS L HI S YL AI 

PFLADSFLKPV/L*PGNSAKHLSFKX J SSLSMVS 

GRAVALLHLIASGLTS1QTNTASSKPPIWGY\L 

STQTSFISPPPLCLSRTYPNPAHATMVGQVPQ 

SLCGLIFTL/RTPCRPSILHPNYKIISTSAWQKV 

LCFSGSPTIHTSLHLTTGSSFLSFHPIPGFPAAN 

S AL YVSSLKGPPGKNVTrPSPVTGT* QPPHRGS 

N/RLTVDKDNFFLSPKPNSLHQLPSQ\TPYQAL 

TG AAL AG S YPI WENENTLS WLPTFTYNFCLS T 

PSLFFLCDTN*YLCLPANWSGTCTLVFQAFT1 

NILPPNQTIL IS VE ASIS S SPIRNK WALHLITLLT 

GLGITAALGTGIAGITTSrrSYQTLFTTLSNTVE 

DMHTSrrSLQRQLDFLVGVILQNWRVLDLLT 

TEKGGTCIYLQEECCFCVNESG1VHIAVRRLH 

DRAAEL*HQVADSWWQGSSLLRWIPWVAPF 

LGPLIFLFLLLMIGPCIFNL VSRFISQRLNCFIQ 

ASMQKHIDN1FHLCHV*YQSLRGNHSEAPEPR 

P 


400 


1750 


A 


3303 


2 


453 


THWRHSSGVPGSTTARRRRRELEIATSDNQE 
YYNRLCQEVTNRERNDQKMLADLDDLNRTK 
KYLEERLIELLRDKDALWQKSDALEFQQKLS 
AEERVVLGDTEANHCLDCKREFSWMVRRHHC 
R1CGRIFCYY CCNNYVLSKHGGKKERCC 


401 


1751 


A 


3304 


1 


626 


MAPQHSSLDDKVPQQASTVCFEFQD1LQHSQ 

CTEHKDSLWGPGARSQPFGAJINTRLSPDSCP 

EKIVLRALKDSRAGMPEQDKDPGVQENPDD 

QRRVPQGTGDAPSAFRPLWDNGGLSPFVSRP 

GPLERDLHAQRSEVTYNQRSQS S WMSSFPKR 

NAFVSPYSSMGQAQP/GLPKTNP1GESCCWEG 

LSLSTQILG*QKPSKYIPSLCKR 


402 


1752 


A 


3305 


1678 


172 


MELPSGPGPERLFDSHRLPGDCFLLLVLLLYA 
PVGFCLLVLRLFLGIHVFLVSCALPDSVLRRF 
WRTMCAVLGLVARQEDSGLRDHSVRVLISN 
H VTPFDFTNrVNLLTTC STVSE SE AES ATGRFP 



WO 01/57188 PCT/US01/03800 

SEQ ID NO: of nucleotide sequence 

SEQ ID NO: of peptide sequence 

Method 

SEQ ID NO: in USSN 09/496914 

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence 

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence 

Amino acid sequence (A=Alanine, C=Cysteine, 
D=Aspartic Acid, E=Glutamic Acid, 
F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, 
M=Methionine, N=Asparagine, P=Proline, 
Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, 
Y=Tyrosine, X=Unknown, *=Stop codon, 
/=possible nucleotide deletion, \=possible 
nucleotide insertion) 

GAQLKAPLSPLAFRMEDTEALPLTPILYPTCQ 
FFFFMFLN1FLLAFSSPGSQPLLNSPPSFVCWSR 
GFMEMNGRGELVESLKRFCASTRLPPTPLLLF 
PEEE ATNGREGLLRFSS WPFSIQD WQPLTL Q 
VQRTL VS VTVSDAS W VSELLK WSLFVFFTVY 
QVRWLRPVHRQLGEANEEFALRVQQ\LVAKE 
LG\QTGTRLTPA\DKAEHMKRQRHPR\LRPQS 
AQS SFPPSP W VLSS/SDVQTGQTLG FREFKESF 
CPHVAIGVFrPERPWPKTGCCKTLTIHLILL*G 
GPVSFSCPFADIHPRGT*VPTQQASGLPSFPSYG 
PARGGVL* HPSAQQPLTFA\KSS\WARAGRAL 
QERKQ\AJLYEYARRRFTERRAPGGLD 


403 


1753 


A 


3307 


44 


447 


DPSPSLLAVALGLRAGERTRSGPGSSSPSGGIS 

GGASAGLASSPECACGRSHFTCAVSALGECT 

CIPAQWQCDGDNDCGDHSDEDGCILPTCSPL 

DmCDNGKCIRRSWVCDSDNDCEDDSDEQD 

CPPRECEED 


404 


1754 


A 


3311 


409 


1 


PRHGWGRRVLGRDRPRLQKVKKSVKAIYrPG 
QDHVQNEErVARVLDKFGSNFLSRDNADLGT 
AFVKPSTLTK*LSALLKNLLQGLSRNVIFTLDS 
LLKGDLKGVKGDLKKPFDKAWKDYETKFAK 
IEKEKREREWR 


405 


1755 


A 


3322 


12 


458 


AAVPVENPWDDPRVRPRVRIFTWEDCIAGQA 
KVLCNDSYGVTIDWSPKGAFIRLTSQSVGNG 
HPASKENDQMVDTIKNTTKVPIIWTYGDMVE 
PRPQMIRP A VG AKHKELWKJ LMALKKIKAIWE 
GK YTKPS Q YNPNYMLEL AHNDSVW 


406 


1756 


A 


3324 


1 


426 


LSMLSTISTEHRLSVLWPIWYCCHCPTHLSAV 
MCVLLWALSLLQSILEWMFCSFLFSDVDSDN 
WCQILDFLTAVWLIFLIVLVLCGFTLVLLVRIIC 
GSQKMPLTRLYVTILLTGLVFLFCSLPLSIQ+F 
LLYWIEKDLDDL 


407 


1757 


A 


3328 


213 


1841 


SGDLSPAELMMLTIGDVIKQLIEAHEQGKD1D 

LNKVKTKTAAICYGLSAQPRLVDI1AAVPPQY 

RKVLMPKLKAKPIRTASGIAVVAVMCKPHRC 

PHISFTGN1CVYCPGGPDSDFEYSTQSYTGYEP 

TSMRAIRARYDPFLOTRHRIEQLKQLGHSVD 

KVEFIVMGGTFMALPEEYRDYFIRNLHDALS 

GHTSNNIYEAVKYSERSLTK.CIGITIETRPDYC 

MKRHLSDMLTYGCTRLEIGVQSVYEDVARD 

TNRGHTVKAVCESFHLAKDSGFKVVAHMMP 

DLPNVGLERDIEQFTEFFENPAFRPDGLKLYP 

TLVIRGTGLYELWKSGRYKSYSPSDLVELVA 

RILALVPPWTRVYRVQRDIPMPLVSSGVEHG 

>JLRELALARMKDLGIQCRDVRTREVGIQEIH 

HKVl^YQVELVRRDYVANGGWETFLSYEDP 

DQDILIGLLRLRKCSEETFRFELGGGVSIVREL 

HVYGSWPVSSRDPTKFQHQGFGMLLMEEA 

ERIAREEHGSGKIAVISGVGTRNYYRKIGYRL 

QOr Y M V K.MLN 


408 


1758 


A 


3335 


3 


467 


AIASPRAAG1RHELTSTMAAGKNKRLTKGGK 
KGAKKKAV/DNIINIGKTLVTRTQRTKIASDG 
LKGR VFEESL ADLQND\TDG YLLRVI* VAFTT 
ERTNQl/REVFNKLIPDSIGKDIEKACQSIYPLH 
DDFARKVKMLKKPKFELRKLMELHGEGSS 


409 


1759 


A 


3338 


7 


1252 


PRWRNSARDEILLSFPQNYYIQWLNGSLIHGL 
WNLASLFSNLCLFVLMPFAFFFLESEGFAGLK 
KGIRARILETLGMLLLLALLILGIVWVASALID 
NDAASMESLYDLWEFYLPYLYSCISLMGCLL 
I J J.CTPVGL\SRMFrVMGQLLVXPTTLEDLD E 
QrYIITLEEEALQRFTKWAVFIRW/KYNIMELE 
QELENVKTLKTKLERRKKASAWERNLVYPA 
VMVLLLIETS1 S VLLV ACNILCLL VDET AMPK 
GTRGPGIQNASLSTFGFVGAALEIILIFYLMVS 
SWGFYSLRFFGNFTPKKDDTTMTKIIGNCVS 
ILVLSSALPVMSRTLGITRFDLLGDFGRFNWL 
GNFYIVLSYNLIJJArVTTLCLVRKFTSAVREE 
LFKALGLHKLHLPNTSRDSETAKPSVNGHQK 
AL 


410 


1760 


A 


3339 


127 


1433 


GSHRFSLASPLDPEVGPYCDTPTMRTLFNLL 

WLAr^CSPVHTTLSKSDAKKAASKTLLEKSQ 

FSDKPVQDRGLVVTDLKAESWLEHRSYCSA 

KARDRHFAGDVLGYVTPWNSHGYDVTKVFG 

SKFTQISPVWLQLKRRGREMFEVTGLI IDVDQ 

G WMRAVRKHAKGL\P * CLGSCLRTGLTMI SG/ 

YVLDSEDEIEELSKTVVQVAKNQHFIXjPVVE 

VWNQLLSQKRVGLIHMLTHLAEALHQARLL 

ALLV1PPAITPGTDQLGMFTHKEFEQLAPVLD 

GFSLMTYDYSTAHQPGPNAPLSWVRACVQV 

LDPKSKWRSKILLGLNFYGMDYATSKDAREP 

VVGARYIQTLKDHRPRMVWDSQVSEHFFEY 

KXSRSGRJiVVFYPTLKSLQVRLELARELGVG 

V SIWELGQGLD YFYDLL* VGIAAS A VD VFFSK 

PWSE 


411 


1761 


A 


3342 


74 


2701 


V ATRKL AKGFT QF AKMTEGTKKTSKKJF KFFK 

FKGFGSFSNLPRSFTLRRSSASISRQSHLEPDTF 

EATQDDMVTVPKSPPAYARSSDMYSHMGTM 

PRPSIKKAQNSQAARQAQEAGPKPNLVFGGV 

PDPPGLEAAKEVMVKATGPLEDTPAMEPNPS 

AVEVDPIRKPEVPTGDVEEERPPRDVHSERAA 

GEPE AGSDYVKFSKEK YILD S SPEKLHKELEE 

ELKLS STDLRSHA WYHGRIPREV SETL VQRN 

GDFLIRDSLTSLGDYVLTCRWRNQALHFKIN 

KVWKAGESYTHIQYLFEQESFDHVPALVRY 

HVGSRKAVSEQSGAIIYCPVNRTFPLRYLEAS 

YGLGQGSSKPASPVSPSGPFCGSHMKRRSVTM 

TDGLTADKVTRSDGCPTSTSLPRPRDSIRSCA 

LSMDQIPDLHSPMSPISESPSSPAYSTVTRVHA 

APAAPS ATALPA SPV ARRSSEPQLCPG S APKT 

HGESDKGPHTSPSHTLGKASPSPSLSSYSDPDS 

GHYCQLQPPVRGSREWAATETSSQQARSYGE 

RLICELSENGAPEGDWGKTFTVPIVEVTSSFNP 

ATFQ S L L IPRDNRP LE VGL LRK VKE LL AE VD A 

RTLARHVTKVDCLVARILGVTKEMQTLMGV 

RWGMELLTLPHGVRKLRLDLLERFHTMSIML 

AVDILGCTGSAEERAALLHKTIQLAAELRGT 

MGNMFSFAAVMGALDMAQISRLEQTWVTLR 

QRHTEGAILYEKKLKPFLKSLNEGKEGPPLSN 

TTFPHVLPL1TLLECDSAPPEGPEPWGSTEHGV 

EWLAHLEAARTVAHHGGLYHTNAEVKLQG 

FQARPELLEVFSTEFQMRLLWGSQGASSSQA 

RRYEKFDKVLTALSHKLEPAVRSSEL 


412 


1762 


A 


3347 


1 


893 


IDRAAECRTKPLPMAVSIRGNADSIVACLVLM 

VLYLIKKRLVACAAVFYGFAVHMKIYPETY1 

LPrrLHLLPDRDNDKSLRQFRYTFQACL*ELL 

KRLCNRTALMFVAVAGLTFFALSFGFYYEYG 

WEFLEHTYFYHLTRRDIRHNFSPYFYMLYLT 

AESKWSFSLGIAAFLPQLILLSAVSFAYYRDL 

WCWFLrTTSIFVTFNKVCTSQYFLWYLCLLPL 
VMPLVRMPWKRAVVLLMLWFIGQAMWLAP 
A Y VL EF Q GKNTFLFI WL A GLFFLLINC SILI Q 1 1 
SHYKEEPLTERIKYD 


413 


1763 


A 


3361 


3 


474 


PIPVRWNSLEGRLLRGYEQHANDGKDYISRN 

♦DLRSWTAADMAAQ1TKRKWEAEEFAEQIKA 

YLEGTCVER/LRTHLENGKETLQLTEQSSQPTI 

P1VGIVAGLVLLGAWTGAWSAVMCRKKNS 

GHFLPTDRVSYSEAASSDHAQGSDVSLTACK 

V 


414 


1764 


A 


3363 


1488 


453 


HQILELKKKILKTYNPD YDEDL VQE AS SEDV L 

G VHMVDKLD TERDIEMKRQLRRLRELi 1LYST 

WKKYQEAMKTSLGVPQRERDEGSLGKPLCP 

PE1LSETLPGSVKKRVCFPSEDHLEEFIAEHLP 

EASNQSLLTVAHADAGTQTNGDLEDLEEHGP 

GQTVSEEATEVHMMEGDPDTLAELLIRDVLQ 

ELSSYNGEEEADPEEVKTSLGVPQRGDLEDLE 

EFWPGQTVSEEATGVHMMQVDPATLAKSDL 

EDLEEHVPEQTVSEEATGVHMMQVDPATLA 

KQLEDSTITGSHQQM S ASPS SAP AEEATEKTK 

VEEEVXTRKPKKKTRKPSKKSRWNVLKCWD 

IFN1F 


415 


1765 


A 


3369 


431 


315 


IP WS WVGRLS VRKMSILF *LTYN YNA1LNKTP 
PSFSPSL 


416 


1766 


A 


3373 


42 


651 


RQEKMGLGE1GASGVLRSMLKERKKQNMKG 
NGNVTLTPLLPAVQCGCHLQPAGRSPLPSSHS 
APGLCSPLHPLQPQQEASTCPSGTLQGREKAA 
PGQGRPLCSLWAGGAGAYPGERGAEGRGPSD 
QAPDPKSGPWLFPPGLGAPAEVRLHNVPHNL 
RRPPLP* ARGK* PPNSGCPWSEGRAKQPL SCG 
PKPQCSLPSQVPGDTH 


417 


1767 


A 


3382 


2 


2061 


EAQDPRACGPDAGGRFAARDAPGNSLRPPPS 

SPP/GWPGQLRLLPRVPGSELRCGKPERGRLP 

ASPPGKIRGWPPGISKRPGLGGRSFPPGFAPRT 

WRPE ARGPSVQ SLPPIFSPQS AQTT AR* RPG AP 

KNAGRCGGA\RGPRLSLGPPPGPPPAPALPAR 

ASAGAGAAAAALAVGGVRGAGGARGTGGY 

GHCSGR/PTGRTGPGPQGPGPPMPARPR* AS\S 

TRGSRRGPGSRPARAAAAPRAGDHGRRPVRV 

HLRQHT A V* EPRLGDATAPPGG AAGPG AP AP 

R\GPGWDCALLPSPGPRSPRAVGCAEPEIWDP 

SPRRGTSPVPSVRSLRSEPANPRLGLPALLNSY 

PLKGPGLPPPWGPRTQTGHVnTVQPSGSCIEH 

SKSLD/RGPWGAPPWGPSSSGLCSPKLATAGP 

PQSWGLCQIGRRRGLGGPGLKRGET/GLL+GC 

SMDHANRTKGPGVPTSNRCFSHIPG\GDGCSD 

HS SCEGHPDLHAGREMPAAPGL SELER VRFT 

VGCGGLASGISSASVSGLSPNRAGGPGQGDW 

EMYPVSWQTQESGGQG/SPKTGR'VGMLQA 

GAGSLQGGTGDGVWGL WEDGP/RG* DSPLPS 

GTGTEP* TPTTSIPFFPQPSG VY PSRATLLPMPS 

Y'ALGPSANKSEKFLLSFLYRGLCCRISLQLA 

KGIGQLSEIPLLNVETAFWSMWVTYFRK 


418 


1768 


A 


3398 


304 


2121 


EEEEEEEDEDDDDNNEEEEFECYPPGMKVQV 
RYGRGKNQKMYEASIKDSDVEGGEVLYLVH 
YCGWNVRYDEW1KADKJVRPADKNVPKIKH 
RKXIKNKLDKEKDKDEKYSPKNCKPPALGPN 
PPFQTNPIS WKWYPKLDLTDAKNS DTAHIK S I 
ErrSILNGLQASESSAEDSEQEDERGAQDMDN 
NGKEESKIDHLTKNRNDLISKEEQNSSSLLEE 
NKVHADLVISKPVSKSPERLRKDIEVLSEDTD 

YEEDEVTKKRKDVKKDTTDKSSKPQIKRGKR 

RYCNTEECLKTOSPGKKEEKAKNKESLCMEN 

SSNSSSDEDEEETKAKNfTPTKKYNGLEEKRK 

SLRTTGFYSGFSEVAEKRIKLLNNSDERLQNS 

RAKDRKDVWSSIQGQWPKXTLKELFSDSDTE 

AAASPPHPAPEEGVAEESLQTVAEEESCSPSV 

ELEKPP P VN VD S KP IEEKT VE VNDRKAEFPS S 

GSNFSA*IPLPYLHLNRLHQSL*QKGSRQQSS 

VTVSEPLAPNQEEVRSIKSETDSTIEVDSVAGE 

LQDLQSERE* LASRF*CQCELKQ* * SARTRTS* 

KSL YRSEKSERCSGRRKFIKKAEKKP* SNS GK 

QQKEGK 


419 


1769 


A 


3399 


206 


463 


QRECLSIHIGQAGIQIGDACWELYCLEHGIQP 
NGVVLDTQQDQLENAKMEHTNASFDTFFCE 
TR A GKHVPRAJLFVDLEPTVTDGIR 


420 


1770 


A 


3408 


1010 


685 


RRLSFFF*IWSSVLVTQARVQWRDLGSPQPLP 
PGFKRFSCLSLPSSWDYRHPSPRPVNF/HVFLV 
VMGFHHVGQAGLELLTSGDLPALASQSARIT 
GVNHCAQPRGHFH 


421 


1771 


A 


3409 


355 


1326 



ADSNLIESCWQELGLGPWGGDWRVEQVGAS 

ASLRFPREVCS1RFLFTAVSLLSLFLSAFWLGL 

LYLVSPLENEPKEMLTLSEYHERVRSQGQQL 

QQLQAELDKLHKEVSTVRAANSERVAKLVF 

QRLNEDFVRKPDYALSSVGASIDLQKTSHDY 

ADRNTAYFWNRFSFWNYARPPTVILEPHVFP 

GNCWAFEGDQGQWIQLPGRVQLSD1TLQHP 

PPSVEHTGGANSAPRDFAVFFLLSFFTHQGLQ 

VYDETE VSLG KFTFD VEKSEIQTFHLQNDPP A 

AFPKVK1QILSNWGHPRFTCLYRVRAHGVRT 

SEGAEGSAQGPH 


422 


1772 


A 


3412 


2 


421 


EFDAQPSIGALWFKRP+ATTGSDPGPKRGMN 

YLVSCSMRSPESGKGEPGTARDYTPMGRPPP 

PVPSVSPGPLPGSLA1APHSPEPHPWEQQPPRG 

QARSPPGGWLGSAT/RVRRPHNHP/RGH/HSP 

VDTAGAPASPGPDVCE 


423 


1773 


A 


3420 


91 


706 


DAQRA1YSSVGPAVSLRQRQQDGAVKESGR/ 
RGGVRSFSRAAAAMAPIKVGDAIPAVEVFEG 
EPGNK VNLAELFKGKKGVLFG VPGAFI PGCS 
KTHLPGFVEQAJEALKAKGVQVVACLSVNDA 
FVTGEWGRAHKAEGKVRLLADPTGAFGKET 
DLLLDDSLVSIFGNRRLKRFSMVVQDGIVKA 
LNVEPDGTGLTCSLAPNIISQL 


424 


1774 


A 


3421 


4 


7688 


RQVTRVGTRVLGSTTAA VFL S VEDDNDN APQ 

FSEKRYWQVREDVTPGAPVLRVTASDRDKG 

SNAVVHYSIMSGNARGQFYLDAQTGALDVV 

SPLDYETTKJEYTLRVRAQDGGRPFLSNVSGL 

VTVQVLDrNDNAPIFVSTPFQATVLESVPLGY 

LVLHVQAIDADAGDNARLEYRIAGVGHDFP 

FTTNNGTGWI S V AAEUDREEVDFYSFG VEAR 

DHGTPALTASASVSVTALDVNDNNPTFTQPE 

YTVRLNEDAAVGTSVVTVSAVDRDAHSVITY 

QrrSGNTRNRFSITSQSGGGLVSLALPLDYKLE 

RQYVI^VTASDGTRQDTAQIVVNVTDANTH 

RPVFQ S S HYTVN VNEDRPAGTTWLIS ATDE 

DTGENARTTYFMEDSIPQFRIDADTGAVTTQA 

ELDYEDQVSYTl^TARDKGIPXiKSDTTYLEI 

L VND VNDN APQFLRD S Y QG S VYED VPPFTSV 

LQISATDRDSGLNGRVFYTFQGGDDGDGDFI 



VESTSGIVRTLRRLDRENVAQYVLRAYAVDK 

GMPPARTPMEVTVTVLDVNDNPPVFEQDEFD 

VFVEENSPIGLAVARVTATDPDEGTNAQIMY 

QIVEGNIPEVFQLDIFSGELTALVDLDYEDRPE 

YVLVIQATSAPLVSRATVHVRLLDRNDNPPV 

LGNFEILFNN Y VTNRS S SFPGG AIGR VPAHDP 

DISDSLTYSFERGNELSLVLLNASTGELKLSR 

ALDNNRPLEAIMSVLVSDGVHSVTAQCALRV 

TIITDEMLTHSITLRLEDMSPERFLSPLLGLFIQ 

AVAATLATFPDHVVVFNVQRDTDAPGGHILN 

VSLSVGQPPGPGGGPPFLPSEDLQERLYLNRS 

LLTAISAQRVLPFDDN1CLREPCENYMRCVSV 

LRFDSSAPFIASSSVLFRPIHPVGGLRCRCPPGF 

TGDYCETEVDLCYSRPCGPHGRCR5REGGYT 

CLCRDGYTGEHCEVSARSGRCTPGVCKNGGT 

CVNLLVGGFKCDCPSGDFEKPYCQVTTRSFP 

AHSFITFRGLRQRFHFTLALSFATKERDGLLL 

YNGRFNEKHDFVALEVIQEQVQLTFSAGEST 

TTVSPFVPGGVSDGQWHTVQLKYYNKPLLG 

OTGLPQGPSEQKVAVVTVDGCDTGVALRFGS 

VLGNYSCAAXQGTQGGSKKSLDLTGPLLLGG 

VPDLPESFPVRMRQFVGCMRNLQVDSRHIDM 

ADFIANNGTVPGCPAKKNVCDSKTCHNGGTC 

VNQWDAFSCECPLGFGGKSCAQEMANPQHF 

LGSSLVAWHGLSLPISQPWYLSLMFRTRQAD 

GVLLQAITRGRSTITLQLREGHVMLSVEGTGL 

QASSLRLEPGRANDGDWHHAQLALGAIGGP 

GHAILSFDYGQQRAEGNLGPRLHGLHLSNITV 

GGIPGPAGGVARGFRGCLQGVRVSDTPEGVN 

SLDPSHGE S INVEQGCS LPDPCDSNPCP AN S Y 

CSNDWDSYSCSCDPGYYGDNCTNVCDLNPC 

EHQS VCTRKPS APHG YTCECPPNYLGP YCET 

RIDQPCPRGWWGHPTCGPCNCDVSKGFDPDC 

NKTSGEC HCKEN HYRPPG S PTC LI CDCYPTG 

SLSRVCDPEDGQCPCKPGVIGRQCDRCDNPF 

AEVTTNGCEVNYDSCPRAIEAGIWWPRTRFG 

LPAAAPCPKGSFGTAVRHCDEHRGWLPPNLF 

NCTSITFSELK.GF AERLQRNESGLD SGRSQQL 

ALLLRNATQI ITAGYFGSDVKVAYQLATRLL 

AHESTQRGFGLSATQDVHFTENLLRVGSALL 

DTANKRHWELIQQTEGGTAWLLQHYEAYAS 

AIAQNMRHTYLSPFTTVTPNIVISVVRLDKGN 

FAGAKLPRYEALRGEQPPDLETTVILPESVFR 

ETPPWRPAGPGEAQEPEELARRQRRHFELSQ 

GEAVASVIIYRTLAGLLPHNYDPDKRSLRVPK 

RPIINTPWSISVHDDEELLPRALDKPVTVQFR 

LLETEERTKPICVFWNHSILVSGTGGWSARGC 

EVVFRNESHVSCQCNHMTSFAVLMDVSRRE 

NGEILPLKTLTYVALGVTLAALLLTFFFLTLL 

RILRSNQHGIRRNLTAALGLAQLVFLLGINQA 

DLPF ACTVI AILLHFL YLCTFS W ALLEALHL Y 

RALTEVRDVNTGPMRFYYMLGWGVPAFITG 

LAVGLDPEGYGNPDFCWLSrYDTLIWSFAGP 

VAFAVSMSVFLYILAARASCAAQRQGFEKKG 

PVSGLQPSFAVLLLLSATWLLALLSVNSDTLL 

FHYLFATCNCIQGPFIFLSYWXSKEVRKALK 

LACSRKPSPDPALTTKSTLTSSYNCPSPYADG 

RLYQTAYGDSAGSLHSTSRSGKSQPSYTPFLLR 

EESALNPGNQGPPGLGGIPGR/I.CFLGRFKDQQ 

HVDS'TRDFDSDLSLEDDQSGSYASTHSSDSEE 
EEEEEEEEAAFPGEQGWD SLLGPG AERLPLH S 
TPKDGGPGPGKAPWPGDFGTTAKESSGNGAP 
FJERLRENGD ALSREG SLGPLPGSS AQPHKGIL 
KKKCLPTISEKSSLLRLPLEQCTGSSRGSSASE 
GSRGGPP S RPPPRQSI .QEQLNG VMP I AMSIK A 
GTVDEDSSGSEFLFFNFLH 


425 


1775 


A 


3429 


155 


1417 


GEPAVQSCDCGCTQRSCPWLLVAPGLLSSSSS 

RAAS VRE AEDAPLQPASIHP V S QG SRGPEG SL 

G S AECLPGDPLG ARRATRAHSP VPGPPPSLP A 

AGTAVKRGLQPG*G A/GATSTPGTG AATGGI . 

CGPAWAAPSAVGPCCCCPSISTTPSQMRSARP 

SLGCLPSWAS\PGTEHPPGPQGPGPS*DLCSV* 

KREFQRGPWAGMVELHRISAADPARAPGPDS 

NLQ S ALQQPATGCSEP AA VYSPPIGL WG A* * P 

EYG*PQHSLPG*TAPADR*P\AGIKDRVYSNSI 

YELLENGQRAGTCVLEYATPLQTLFAMSQYS 

QAGFSREDRLEQAKLFCRTLEDILADAPESQN 

NCRLIAYQEPADDSSFSLSQEVLRHLRQEEKE 

EVTVGSLK.TSAVPSTSTMSQEPELLISGMEKP 

LPLRTDFS 


426 


1776 


A 


3431 


1662 


369 


AI W WL S WLQHDLLPTPTQ V AIDFTA SNGDPR 

SSQSLHCLSPRQPNHYLQALRAVGGICQDYD/ 

SVGESGAGGNRQGGLAQRIPQLFLLPSDKRFP 

AFGFGARIPPNFEVG* MRGKEGDGGRVSQAE 

KAGPHCSRLALTGXSHDFATNFDPENPECEGK 

RGDFHLPRLPADTLHTGAQTPLPRAQLPVPST 

HPRPVFT\EISGVIASYRRCLPQIQLYGPTNVAP 

IINRVAEPAQREQSTGQATKYSVLLVLTDGV 

VSDMAETRTAIVRASRLPMSIIIVGVGNADFS 

DMRLLDGDDGPLRCPRGVPAARDrVQFVPFR 

DFKDVSPPGPFRLKDSSASHPPKSDLRLPPFD 

VLLRTREPSWPP* SPTSPSDDPASPTLPLTPNHI 

T VPTL\AAPS AL AKC VL AE VPRQVVEYYASQ 

GISPGAPRPCTLATTPSPSP 


427 


1777 


A 


3446 


79 


9748 


GCQSCWPAWPRLRRRGPASAGARLGRKAPW 

GLPGRVQDGRPLRFCFYLRPRAPFIAPVLSGA 

ASRPEASGDCRAGRETAMATLEKLMKAFESL 

KSFQQQQQQQQQQQQQQQQQQQQQQQPPPP 

PPPPPPPQLPQPPPQAQPLLPQPQPPPPPPPPPP 

GPAVAEEPLHRPKXELSATKKDRVNHCLTIC 

ENJVAQSVRNSPEFQKLLGIAMELFLLCSDDA 

ESDVRMVADECLNKV1KALMDSNLPRLQLEL 

YKE1KKNGAPRSLRAALWRFAELAHLVRPQK 

CRPYLVNLLPCLTRTSKRPEESVQETLAAAVP 

KIMASFGNFANDNEIKVLLKAFIANLKSSSPTI 

RRTAAGSAVSICQHSRRTQYFYSWLLNVLLG 

LLVPVEDEHSTLLILGVLLTLRYLVPLLQQQV 

KDTSLKGSFGVTRKEMEVSPSAEQLVQVYEL 

TLHHTQHQDHNVVTGALELL QQLFRTPPPEL 

LQTLTAVGGIGQLTAAKEESGGRSRSGSIVELI 

AGGGSSCSPVLSRKQKGKVLLGEEEALEDDS 

ESRSDVSSSALTASVKDEISGELAASSGVSTPG 

SAGHDIITEQPRSQHTLQADSVDLASCDLTSS 

ATDGDEEDILSHSSSQVSAVPSDPAMDLNDG 

TQASSPISDSSQTTTEGPDSAVTPSDSSEIVLD 

GTDNQYLGLQIGQPQDEDEEATGILPDEASEA 

FRNSSMALQQAHLLKNMSHCRQPSDSSVDKF 

VLRDEATEPGDQENKPCRIKGDIGQSTDDDS 

APLVHCVRLLSASFLLTGGKNVLVPDKDVRV 

SVKALALSCVGAAVALHPESFFSKLYKVPLD 
TTEYPEEQYVSDILNYIDHGDPQVRGATAILC 

GTLICSILSRSRFHVGDWMGTIRTLTGNTFSL 

ADCIPLLRKTLKDESSVTCKLACTAVRNCVM 

SLCSSSYSELGLQLUDVLTLRNSSYWLVRTEL 

LETLAEIDFRI , VSFLEAKAENLHRG AHHYTGL 

LKLQERVLNNVVEHLLGDEDPRVRHVAAASL 

IRLWKiFTKCDQGQADPVVAVARDQSSVYL 

KLLMHETQPPSHFSVSTITRIYRGYNLLPSITD 

VTMENNLSRVIAAVSHELITSTTRALTFGCCE 

ALCLLSTAFPVCIWSLGWHCGVPPLSASDESR 

KSCTVGMATMELTLLSSAWFPLDLSAHQDAL 

ILAGNLLAASAPKSLRSSWASEEEANPAATK 

QEEVWPALGDRALVPMVEQLFSHLLKVINIC 

AHVLDDVAPGPAIKAALPSLTNPPSLSPIRRK 

GKEKEPGEQASVPLSPKKGSEASAASRQSDTS 

GPVTTSKSSSLGSFYHLPSYLKLHDVLKATHA 

NYKVTLDLQNSTEKFGGFLRSALDVLSQILEL 

ATLQDIGKCVEEILGYLKSCFSREPMMATVC 

VQQLLKTLFGTNLASQFDGLSSNPSKSQGRA 

QRLGSSS VRPGLYI IYCFMAPYTHFTQALAD A 

SLRNMVQAEQENDTSGWFDVLQKVSTQLKT 

NLTSVTKNRADKNAIHNHIRLFEPLVIKALKQ 

YTTTTCVQLQKQVLDLLAQLVQLRVNYCLL 

DSDQVFIGFVLKQFEYIEVGQFRESEAIIPNIFF 

FLVLLSYERYHSKQHGIPKilQLCDGIMASGR 

KAVTHAIPALQPrVHDLFVLRGTNKADAGKE 

LETQKEVWSMLLRLIQYHQVLEMFILVLQQ 

CHKENEDKWKRLSRQIADIILPMLAKQQMHI 

DSHEALGVLNTLFEILAPSSLRPVDMLLRSMF 

VTPNTMASVSTVQLWISGILAILRVLISQSTED 

IVLSRJQELSFSPYLISCTVTNRLRDGDSTSTLE 

EH SEGKQ1KNLPEETFSRFLLQL VGILLEDPVT 

KQLKVEMSEQQHTFYCQELGTLLMCLIHIFKS 

GMFRRITAAATRLFRSDGCGGSFYTLDSLNLR 

ARSMTTTHPALVLLWCQILLLVNHTDYRWW 

AEVQQTPKRHSLSSTKLLSPQMSGEEEDSDLA 

AKLGMCNREIVRRGALILFCDYVCQNLHDSE 

HLTWXrVNHIQDLISLSHEPPVQDFISAVHRNS 

AASGLF1Q AIQ SRCENL STPTMLKKTLQCLEG I 

HLSQSGAVLTLYVDRLLCTPFRVLARMVDIL 

ACRRVEMLLAANLQSSMAQLPMEELNRIQEY 

LQSSGLAQRHQRLYSLLDRFRLSTMQDSLSPS 

PPVSSHPLDGDGHVSLETVSPDKDWYVHLVK 

SQC WTRSDS ALLEGAEL VNRIPAEDMN AFM 

MNSEFNLSLLAPCLSLGMSEISGGQKSALFEA 

AREVTLARVSGTVQQLPAVHHVFQPELPAEP 

AAYWSKLNDLFGDAALYQSLPTLARALAQY 

LVWSKLPSHLHLPPEKEKDIVKFWATXEAL 

S WHL1HEQIPLSLDLQ AGLDCCCL ALQLPGL 

WSVVSSTEFVTHACSLIYCVHFILEAVAVQPG 

EQLLSPERRTNTPKAISEEEEEVDPNTQNPKYI 

TAACEMVAEMVESLQSVLALGHKRKSGVPA 

FLTPLLRNniSLARLPLVNSYTRVPPLVWKLG 

WSPKPGGDFGTAFFEIPVEFLQEKEVFKEFIYR 

INTLGWTSRTQFEETWATLLGVLVTQPLVME 

QEESPPEEDTERTQINVLAVQAITSLVLSAMT 

VPVAGNPAVSCLEQQPRNKPLKALDTRFGRK 

LSIIRGIVEQEJQAMVSKRENIATHHLYQAWD 

PVPSLSPATTGALISHEKLLLQINPERELGSMS 

YKLGQVSIHSVWLGNSITPLREEEWDEFEEEE 
ADAPAPSSPPTSPVNSRKHRAGVDIHSCSQFL 

LELYSRWILPSSSARRTPAILISEWRSLLWS 

DLFTERNQFELMYVTLTELRRVHPSEDEILAQ 

YLVPATCKAAAVLGMDKAVAEPVSRLLESTL 

RSSHLPSRVG ALHGIL YVLECDLLDDT AK.QL I 

PVISDYLLSNLKGIAHCVNIHSQQHVLVMCAT 

AFYLIENYPLDVGPEFSASIIQMCGVMLSGSE 

ESTPSIIYHCALRGLERLLLSEQLSRLDAESLV 

KiSVDRVNVHSPHRAMAALGLMLTCMyTG 

KEKVSPGRTSDPNPAAPDSESVTVAMERVSVL 

FDRIRKGFPCEARVVARILPQFLDDFFPPQDrM 

NKVIGEFLSNQQPYPQFMATWYKVFQTLHS 

TGQSSMVRDWVMLSLSNFTQRAPVAMATWS 

LSCFFVS ASTSPW V AAILPHVI SRMGKLEQ VD 

VNLFCLVATDFYRHQ1EEELDRRAFQSVLEV 

VAAPGSPYHRLLTCLRNVHKVTTC 


428 


1778 


A 


3449 


3 


430 


NSRP S PS AALVEVLLRSG STFPHTVSGGW AA 

WGPWSSCSRDCELGFRVRKRTCTNPEPRNGG 

LPCVGDAAEYQDCNPQACPVRGAWSCWTS 

WSPCSASCGGGHYQRTRSCTSPAPSPGED1CL 

GLHTEEALCATQACPEGWS 


429 


1779 


A 


3464 


583 


3 


D ALDRRY LERCHPAAGG WVGEGE* ALCQKT/ 
RFSGVLEPPLPSLKDGGRFPAWT*RSCSKSLR 

AAFTS q ffp srrs ra s pgs ap\gn gqnlteq HP 

CPGSCDPQVLSASWM*VEHRSKFRPPP*NSTI 
PPES/RS»QGGTVQTGQHSSGREAGSWRARGR 
NAGRR*KGGGKJGTKQGAVRARKECRGEMA 
SGETDSE 


430 


1780 


A 


3473 


2802 


270 


nFRMRJFLHCPWNQQMWKJWNLLETSLESCKA 
HLSIQKLLKER\Q\QLPVFKHRDSIVETLKRHR 
WWAGET\GSGKSTQVPHFLLEDLLLNEWE 
ASKCTvnVCTQPRRiSAVSLANRVCDELGCENG 
PGGRNSLCGYQIRMESRACESTRLLYCTTGV 
LLRKLQEDGLLSNVSyHMFIVDEV\HER\SVQS 
DFLL1ILKEILQKRSDLHLILMSATVDSEKFST 
YFTIICPiLRISGRSYPVEVFHLEDriEETGFVLE 
KDSEYCQKFLEEEEEVTINVTSKAGGIKKYQE 
YIPVQTGAHADLNPFYQKYSSRTQHAILYMN 
PHKJNXDLILELLAYLDKSPQFRNIEGAVLIFL 
PGL AHIQQL YDL L SNDRRFYSERYKVIALH S I 
L STQDQ AAAPTLPPPG VRKTVLATNIAETGITI 
PDWFVIDTGRTKENKYHESSQMSSLVETFVS 
KASALQRQGRAGRVRDGFCFRMYTRERFEQ 
FMDYSVPEILRVPLEELCLHIMKCNI.GSPEDF 
LSKALDPPQLQVISNAMNLLRKIGACELNEPK 
LTPLGQHLAALPVNVKIGKMLEFGAIFGCLDP 
VATLAAVMTEKSPFTTPIGRKDEADLAKSAL 
AMAD SDHLTIYNA YLG WKKARQEGG YRSEI 
TYCRRNFLNRTSLLTLEDVKQELtKLVKAAGF 
S S STTSTS WEGNRASQTLSFQEI ALLKA VL VA 
GLYDNVGKIIYTKSVDVTEKLACIVETAQGK. 
AQVHPSSVNRDLQTHGWLLYQEKIRYARVY 
LRETTLITPFPVLLFGGDIEVQHRERLLSIDGW 
IYFQAPVKIAVIFKQLRVLIDSVLRKKLENPK 
MS L END K.IL Q IITEl >IKTENN 


431 


1781 


A 


3474 


1 


441 


FRPAPGHVQP*GGSSAAAGGGLLSHPRPCQQ 

PCPPAPAPSRPRSLGSLGQRVPAALATAAQEL 

PATLGGDGGKPALTAGEAALPGLHRSGVPAA 

AARC*PCT/SRPT*STLSPTQAAWWCRPSRRQ 

QRGEASTGGASGRRCGSCFQV 


432 


1782 


A 


3478 


416 


23 


QLRRLTLPNFKTY/YSS*IffilAWH**KNMQID 
QWFRRESPEIDLCKY S *LSFDKE AKAIK/WKE 
CSLFNKWCATCNWM/LHVQKKRI*VQTLHPS 
QKLK\SKWIKDLNVECRITKLLDQEYPGDLGY 
SRALNSGSR 


433 


1783 


A 


3504 


1876 


552 


CLAPCSPQPEKNGMQPLLLLLPPLLYQQLLHS 

SLGAPGESTLLVRTSKLLVGLGLQLLVWLLL 

QTRSLLALQLHLTSSAPLLAAPTAVCSCSRCS 

APRSRCVARPAARTGLPTPAPASSPAPAASPA 

PAASPAPAESTA\PQPLILLPKP/PPAPGAPPPRP 

GAPPPRPAASPSPAASPAPPAASPVLTASPPLP 

AASPSPAASPAPPAASPVLTASPPLPAASPSPA 

ASPAPPAASPVLTASPPLPAASPALAASPVHT 

ASPPVHVASPPVHTASPPVHVASPPVHTASPP 

VHVASPPVHTASPHVHVASPPVHVASPPVHV 

ASPPVHTASPPVHVASPPVHTASPHVHVASPP 

VHTASPPVHVASPPVHVASPPVHVAYPPVHV 

ASPPVHVASPPVHVASPPVSCSGDSTSDCFPP 

QPGAVFPHSLAPSLGGWSHLVAALP 


434 


1784 


A 


3516 


142 


590 


GG VNRPRSETEQVKTP VLI SS WDYRHPPPRP A 
SFFVFLV*TGF\TALARMVLISWPCDLPTSASQ 
SAGITGVRHHA\RLLYFEQESHSVTQAGW\VQ 
WHNLG SLQPLSLEDRLSPG VLGC S ALCRSGV 
RTKFGrNMVTSRERGTTRLPKEG 


435 


1785 


A 


3529 


1 


3161 


M SLVRAALEALDELDLFG VKGGPQS VIH VL A 

DEVQHCQSILNSLLPRASTSKEVDASLLSWS 

FPAFAVEDSQLVELTKQEHTKLQGRYGCCRF 

LRDGYKTPKED PNRL Y Y/ENP AELKLFENIEC 

EWPLFWTYFILDGVFSGNAEQVQEYKEALEA 

VLIKGKNG VPLLPEL YS VPPDR VDEE YQNPHT 

VDRVPMGKLPHMWGQSLYILGSLMAEGFLA 

PGEIDPLNRRFSTVPKPDVWQVYPSLPHGCS 

SKSPSHQCTIISIRTTRKJTAPVSILAETEEIKTIL 

KDKGIYVETIAEVYPIRVQPARILSHIYSSLEIF 

LPFLNSVSGCNNRMKLSGRPYRHMGVLGTSK 

LYDIRKTIFTFTPQFIDQOQFYLALDNKMiVE 

MLRTDLSYLCSRWRMTGQPTITFPISHSMLDE 

DGTSLNSSILAALRKMQDGYFGGARVQTGKL 

SEFLTTSCCTHLSFMDPGPEGKLYSEDYDDN 

YD YLESGN WMNDYD STSHARCGDE V AR YL 

DHLLAHTAPHPKLAPTSQKGGLDRFQAAVQT 

TCDLMSL VTKAKEL HV QNVHMYLPTRXFQA 

SRPSFNLLDSPHPRQENQVPSVRVEIHLPRDQ 

SGEVDFKALVLQLKETSSLQEQADILYMLYT 

MKGPDWNTELYNERSATVRELLTELYGKVG 

EIRH WGLIRY1S GILRKKVEALDE ACTDLL SH 

QKHLTVGLPPEPREKTISAPLPYEALTQLIDEA 

SEGDMSISILTQEIMVYLAMYMRTQPGLPAE 

MFRLRIGLIIQVMATELAHSLRCSAEEATEGL 

MNLSPSAMKNLLHHILSGKEFGVERSVRPTD 

SNVSPAISIHEIGAVGATKTERTGIMQLKSEIK 

QSPGTSMTPSSGSFPSAYDQQSSKDSRQGQW 

QRRRRLDGALNRVPVGFYQKVWKVLQKCH 

GLSVEGFVLPSSTTREMTPGEIKFSVHVESVL 

NRVPQPEYRQLLVEAIL\VLTMLADIEI\HSIGS 

ILAVEKIVHIANDLFLQEQKTLGADDTMLAKD 

PASGICTLLYDSAPSGRFGTMTYLSKAAATY 

VQEFLPHSICAMQ 


436 


1786 


A 


3546 


73 


393 


CP* LTWELLE VKKAE VLQDSLDG RYSTPSSCL 
EQPDSCRPYGRSFYALEEKHVIFSLDVGETDN 
KGKGKTIRGI*TFKGRKGGTYQREHDANPLA 
PXSARSCWMRKG 


437 


1787 


A 


3554 


5157 


2939 


AVRAEPGLEELSSGLRAHSPSATTVCEPEAQG 

SASGCRYAAHPHWGLGGAAAAGGSWEPQPP 

RPVCEPAGRGKPHPPAAPRSPLLPGSRRRPHA 

AQPGARARTSPPPASARNMAARPAATLAWSL 

LLLSSALLREGCRARFVAERDSEDDGEEPWF 

PESPLQSPTVLVAVLARNAAHTLPHFLGCLER 

LDYPKSRMAIWAATDHNVDNTTEIFREWLK 

NVQRL YHYVEWRPMDEPES YPDEJ GPKH WP 

TSRF AHVMKLRQ AALRTAREKW SDY ILFID V 

DNFLTKPQTLNLLI AENKTIV APMLESRGLY S 

NFWCGITPKGFYKRTPD Y\ VQIRE WKRTGCFP 

VPMVHSTFLIDLRKEASDKLTFYPPHQDYTW 

TFDDIIVFAFSSRQAGIQMYLCNREHYGYLPIP 

LKPHQTLQEDIENLIH VQIE AM 1 DRPPMEPSQ 

YVSVVPKYPDKMGFDEIFMINLKRRKGQGGD 

RWLRTLYEQEIEVKIVEAVDGKALNTSQLKA 

LNIEMLPGYRDPYSSRPLTRGEIGCFLSHYSV 

WKEVIDRELEKTLVIEDDVRFEHQFKKKLMK 

LMDNIDQAQLDWELIYIGRKRMQVKEPEKA 

VPNVANLVEADYSYWTLGYVISLEGAQKJLV 

GANPFGKMLPVDEFLPVMYNKHPVAEYKEY 

YESRDLKAFSAEPLLIYPTHYTGQPGYLSDTE 

TSTIWDNETVATDWDRTHAWKSRKQSRIYSN 

AKNTEALPPPTSLDTVPSRDEL 


438 


1788 


A 


3563 


130 


527 


IFFNSSSLFCRVFCLFLR\VSFTLVAQARVQ*C 
NLSSLQPLPPGFK*FSCLSPPRS*DYRRPPPRPA 
NFLYF**RQGFTVLGQAGLELLT/S/GDPPTSA 
SQS AGiTGVSHRA WP VH AISTHI SL VKTRPSLT 
TLG 


439 


1789 


A 


3565 


446 


1834 


LLQPAMRKSPGLSDCLWAWILLLSTLTGRSY 

GQPSLQDELKDNTTVFTRILDRLLDGYDNRL 

RPGLGERVTEVKTDIFVTSFGPVSDHDMEYTI 

D VFFRQS WKDERLKFKGPMTVLRLNNLMAS 

KIWTPDTFFHNGKKSVAIiNMTMPNKLLRITE 

DGTLLYTMRLTVR\AECPMAFGRDFPM\D\AH 

ACPLKFGSYAYTRAEVVYEWTREPARSVVV 

AEDGSRLNQYDLLGQTVDSGIVQSSTGEYVV 

MTTHFHLKRKiGYFVlQTYLPCIMTVILSQVSF 

WLNRES VP ARTVFGVTTVLTMTTLSIS ARNSL 

PKVA YATAMD WFTA VC Y AFVFS AL IEF ATVN 

YFTKRGYAWDGKSVVPEKPKKVKDPLIKKN 

NTYAPTATSYTPNLARGDPGLATIAKSATIEP 

KEVKPETKPPEPKKTFNSVSKIDRLSRIAFPLL 

FGIFNLVYWATYLNREPQLKAPTPHQ 


440 


1790 


A 


3568 


1 


350 


STSSCFPAAAAAIMREIVHLQAGQCGNQIGAK 
FWEVISDEHGIDPTGTYHGDSDLQLERINVYY 
NEATGE APVPSPT ALRGPRGPCLG* RPP VPAG 
GKYVPRAVLVDMEPGTMDSV 


441 


1791 


A 


3569 


2 


1751 


FVAVAGAVSGEPLVHWCTQQLRKTFGLDVS 

EEIIQYVLSIESAEEIREYVTDLLQGNEGKKGQ 

FIEELITKWQKNDQELISDPLQQCFKKDEILDG 

QKSGDHLKRGRKKGRNRQEVPAFTEPDTTAE 

VKTPFDLAKAQENSNSVKKKTKFVNLYTREG 

QDRLAVLLPGRHPCDCXGQKHKLINNCLICG 

PJVCEQEGSGPCLFCGTLVCTHEEQDH.RGDS 

KJvSQKLLKKLMSGVENSGKVDISTKDI.I.PH 

QELRIKSGLEKAIKHKDKLLEFDRTSIRRTQVI 
DDESDYFASDSNQWLSKLERETLQKREEELR 

E L RHAS RLS KK. VTIDF AGRKDL EEEN SL AE YH 

SRLDETIQAIANGTLNQPLTKLDRSSEEPLGVL 

VNPNMYQSPPQWVDHTGAASQKJCAFRSSGF 

GLEFNSFQHQLRIQDQEFQEGFDGGWCLSVH 

QPW A SLL VRGIKRVEGRS WYTPHRGRL WIAA 

TAKKPSPQEVSELQATYRLLRGKDVEFPNDY 

PSGCLLGCVDLIDCLSQKQFKEQFPDISQESDS 

PFVHCKNPQEMWKFPIKGNPKJWKLDSKIH 

QGAKKGLMfCQNKAV 


442 


1792 


A 


3576 


1 


2019 


MPRSHTGERLCEGKEGSQCAENFSPNLSVTK 

KTAGVKPYECnCGKAFMRLSSLTRHMRSHT 

AIRAI\EKPYKCKEC\GRAI : SLSQ1LSK\HF.RSH 

TGEKPYKCKQCGKTFIYHQPFQRHERTHIGEK 

PYECKQCGKALSCSSSLRVHERIHTGEKPYEC 

KQCGKAFSCS S SIRVHERTHTGEKP Y ACKVEC 

GKAFIS\TTSVLTHMITHNGDRPYKCKECGKA 

FIFPSFLRVHERIHTGEKPYKCKQCGKAFRWS 

TSIQIHERIHTGEKPYKCKECGKSFSARPAFRV 

HVRVHTGEKPYKCKECGKAFSRISYFRIHERT 

HTGEKPYECKXCGKTFNYPLDLKIHKRNHTG 

KCRDCGKVFIFPSALRTHERTHTGEKPYECK.Q 

CGKAFSCSSYIRIHKRTHTGEKVPYECKECGK 

AFIYPT SFQGHMRMHTGEKPYKCKECGKAFS 

LHSSFR\RHTRJHNYEKPLEC*Q\CGKAFSVSTS 

LKKPMRNAQSDRKLY/KCEK*EKVFNSNRCF 

QSCENSH*REKSCQCK*YRKRDTR , FMYSQV 

PHNHV S VSNGPYR/CGSPrRL YNT*NISrNRNL 

VAVVTP* CSTLFKCLWCWCKRAALS W*/IVQ 

DSGRGRWLTPVrPALWEAKAGGSRGQEIKTIL 

ANTVKPHLY 


443 


1793 


A 


3578 


287 


114 


DFYFRKFEQFEEGHKQIVNKWRDLLCSWKRK 
LSnKKSVLQNNL*FSAASMRFQKVFF 


444 


1794 


A 


3582 


3335 


1909 


HLFFSLFLAAMAMTGSTPCSSMSNHTKERVT 

MTKVTLENFYSNLIAQHEEREMRQKKLEKV 

MEEEGLKDEEKRLRRSAHARKETEFLRLKRT 

RLGLEDFESLKVIGRGAFGEVRLVQKKDTGH 

VTAMKn-RKADMLEKEQVGHIRAERDILVEA 

DSL WWKMFY SFQDKLNLYLIMEFLPGGDM 

MTLLMKKDTLTEECTQFYIAETVLAIDSIHQL 

GFIHRDIKPDNLLLDSKGHVKLSDFGLCTGLK 

KAHRTEFYRNLMiSLPSDFTFQ>TMNSKRKAE 

TWKRNRRQLAFSTVGTPDYIAPEVFMQTGYN 

JCLCDWWSLGVIMYEMLIGYPPFCSETPQETY 

KKVMNWK£TL1TPPEVPISEKAKDLLLRFCCE 

WEHRI G APG VEE I K SN SFFEG VD WEHIRERPA 

AISIEIKSIDDTSNFDEFPESDILKPTVATSNHPE 

TDYKNKDWVFrNYTYKRPEGLTARGAlPSYM 

KAAK 


445 


1795 


A 


3584 


1 


6169 


RTRGI EKRFA Y SFLQQL1RY VDE AHQ Y 1LEFD 
GGSRGKGEHFPYEQETKFFAKWLPLIDQYFK 
NHRLYFLSAASRPLCSGGHASNKEKEMVTSL 
FCKLG VL VRHRI SLFGNDATSrVNCLHILGQT 
LDARTVMKTGLESVKSALRAFLDNAAEDLE 
K.TME>OCQGQFTHTRNQPKGVTQirNYTTVA 
LLPMLSSLFEHlGQHQFGEDLILEDVQVSCYRi 
LTSLYALGTSKSIYVERQRSALGECLAAFAGA 
FPVAFLETHLDKHNTYSIYNTKSSRERAALSLP 
TtWEDVCPNIPSLEKLMEElVELAESGIRYTQ 














MPHVMEVILPMLCSYMSRWWEHGPENNPER 

A£MCCTALNSEHMNTLLGNILKIIYNNLGIDE 

GAV/MKRLAVFSQPIINKVKPQLLKTHFLPLM 

EKLKKKAATWSEEDHLKAEARGDMSEAEL 

LILDEFTTLARDLYAFYPLLIRFGDYNRAKWL 

KEPNPEAEELFRMVAEVFtYWSKSMNFKREE 

QNFWQNEINNMSFLITDTKSKMSKAAVSDQ 

ERKKMKJ^GDRYSMQTSLIVAALXRLLPIGL 

NICAPGDQELIALAKNRFSLKDTEDEVRDIIRS 

NIHLQGKLEDPAIRWQMALYKDLPNRTDDTS 

DPEKTVERVLD1ANVLFHLEQKSKRVGRRHY 

CLVEHPQRSKKAVWHKLLSKQRKRAWACF 

RMAPLYNLPRHRAVmFLQGYEKSWIETEEH 

YFEDKLIEDLAKPGAEPPERDEGTKRVDPLHQ 

LILLFSRTALTEKCKLEEDFLYMAYADIMAKS 

CrIDEEDDDGEEEVKSFEEKEMEKQKLLYQQ 

ARLHDRGAAEMVLQTISASKGETGPMVAAT 

LKLGIAILNGGNSTVQQKMLDYLKEKKDVGF 

FQSLAGLMQSCSVLDLNAFERQNKAEGLGM 

VTEEG SGEKVLQDDEFTCDLFRFLQLLCBGH 

NSDFQNYLRTQTGNNTTVNII1STVDYLLRVQ 

ESISDFYWYYSGKDVIDEQGQRNFSICMQVA 

KQVFNTLTE YIQGPCTGNQQSL AHSRL WD A V 

VGFLHVFAHMQMKLSQDSSQIELLKELMDLQ 

KDMWMLLSMLEGNVVNGTIGKQMVDMLV 

ESSNNVEM1LKFFDMFLKLKDLTSSDTFKEYD 

PDGKGVIFKRDFHKAMES HKH YTQS ETEFLL 

SCAETDENETLDYEEFVKRFHEPAKDIGFNVA 

VLLTNLSEHMPNDTRLQTFLELAESVLNYFQP 

FLGRIE1MGSAKRIERVYFEISESSRTQWEKPQ 

VKESKRQFIFDWNEGGEKEKMELFVNFCED 

TIFEMQLAAQ1SESDLNERSANKEESEKERPEE 

QGPRMAFFSILTVRSALFALRYN1LTLMRMLS 

LKSLKKQMKKVKKMTVKDM VTAFFS SYWS1 

FMTLLHFVASVFRGFFRIICSLLLGGSLVEGA 

KKJKVAELLANMPDPTQDEVRGDGEEGERKP 

LEAALPSEDLTDLKELTEESDLLSDIFGLDLKR 

EGGQYKLIPHNPNAGLSDLVSNPVPMPEVQE 

KFQEQKAKEEEKEEKEETKSEPEKAEGEDGE 

KEEKAKEDKGKQKLRQLHTHRYGEPEVPESA 

F V/KXI IA YQQKLLNYF ARNFYNMRMLALF V 

AFAINFrLLFYKVSTSSWEGKELPTRSSSENA 

KVTSLDSSSHRJIAVHYVLEESSGYMEPTX'RIL 

PILHTVISFFCIIGYYCLKVPLVIFKREKEVARK 

LEFD GL Y1TEQP SEDDIKGQ WDRLVIKrQ SFP 

NN Y WDKF VKRK VMDKYGEFYGRDRI SELLG 

MDKAALDFSDAREKKKPKKDSSLSAVLNSrD 

VKYQMWKLGVVFTDNSFLYLAWYMTMSVL 

GHYXKNFFFAAHLLDIAMGFKTLRTTLSSVTH 

NGKQLVLWGfXAVVVYLYTVVAFNFFRKF 

YNKSEDGDTPDMKCDDMLTCYMFHMYVGV 

RAGGGIGDEiEDPAGDEYErYRIIFDrrFFFFVl 

VILLAIIQGLIIDAFGELRDQQEQVKEDMETKC 

FICG1GNDYFDTVPHGFETHTLQEHNLANYLF 

FLMYLINKDETEHTGQESYVWKMYQERCWE 

FFPAGDCFRKQYEDQLN 


446 


1796 


A 


3592 


1 


355 


AGLELLNSDDPPALASQSAGITGVTRTPSLFF* 
DTVLLCCSGWSAVAPSRLTAALFS*AQAVCL 
SLPRSWDYRRW/PPHPANFCIFCRDE/SLA/ML 
PRLVSNSWTQAILLPRPPKMLGLQV 
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Amino acid sequence (A=Alanine C=Cysteine, 
D=Aspartic Acid, E=Glutamic Acid, 
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
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Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion


447 


1797 


A 


3598 


1202 


1070 


LFVGGGPICPEGASGFAPGPAPAPRVGVDAEV 
GR*V*GAAASQGA/GSLRPRPTGPGHPGAWL 
Q VWGAAAVCAGP AM * /AVRAKRGPRAG* EP 
NSPWRSGVLAA\RAVGAGPWP*P*PGCS*ARG 
PSSRSAPGLASGPAAPLLQGVHSSAGPLLCYI 
NGTLALGLKP* * AWGWGEWRPK.G 


448 


1798 


A 


3604 


3115 


557 


FRRKGGGGPKDFGAGLKYNSRHEKVNGLEE 

GVEFLPVNNVKKVEKHGPGRWVVLAAVLIG 

LLLVLLGIGFLVWHLQYRDVRVQKVFNGYM 

RITNENFVDAYENSNSTEFVSLASKVKDALKL 

LYSGVPFLGPYHKESAVTAFSEGSVIAYYWSE 

FSIPQHLVEEAERVMAEERVVMLPPRARSLKS 

FWTSWAFPTDSKTVQRTQDNSCSFGLHAR 

GWLMRFTTPGFPDSPYPAHARCQWALRGD 

ADSVLSLTFTISFDLASCDERGRHLV\TVYNT\L 

SPMEPHAVL VQLCGTYPPS YNLTFH S\S\QNVL 

LrrLITNTERRFIPGVFEATFFQLPRMSSCGGRL 

RKAQGTFN S P YYPGHYPPN I DCT WNIE V PNN 

QHVKVRFKFFYLLEPGVPAGTCPKDYVEING 

EKYCGERSQFWTSNSNK1TVRFHSDQSYTDT 

GFLAEYLSYDSSDPCPGQFTCRTGRCIRKELR 

CDGWADCTDHSDELNCSCDAGHQFTCKNKF 

CKPLFWVCDSLNDCGDNSDEQGCSCP\AQTF 

RCSNGKCLSKS QQCNGKDDCGDGSDE ASCP 

KVNWTCTKHTYRCLNGLCLSKGNPECDGK 

EDCSDGSDEKDCDCGLRSFTRQARWGGTD 

ADEGEWPWQVSLH ALGQGHICGA SI J SPNWI . 

VSAAHCYIDDRGFR YSDPTQ WTAFLGLHDQS 

QRS APG VQERRLKRIIS HPFFNDFTFD YDIALL 

ELEKPAEYSSMVRPICLPDASHVFPAGKAIWV 

TGWGHTQYGGTGALILQKGEIRVINQTTCEN 

LLPQQITPRMMC VGFLSGG VDSCQGDS GGPL 

S SVEADGRIFQAG W S WGDGC AQRNKPGVY 

TRLPLFRDWIKENTGV 


449 


1799 


A 


3618 


2 


613 


FVSGSPWRMDGSTERLEARRPAGRLPWSSRQ 

EMTRRPSLMAGRQHGWSAQQSATVANPVPG 

ANPDLLPHFL GEPEDV YI VKNKP VLL VCKA V 

PATQIFFKCNGEWVRQVDHVIERSTDGSSGLP 

TME VRINVSRQQ VEKVFGLEEY WCQC V A WS 

SSGTTKSQKAY1RIAYLRKNFEQEPLAKEVSL 

EQGIVLPCRPPEGIPPAE 


450 


1800 


A 


3620 


1 


2676 


MEPSLGQGMDLTCPFGVSPACGAQASWSIFG 

ADAAEVPGTRGHSQQEAAMPHIPEDEEPPGE 

PQAAQSPAGQQGPPTAGVSCSPTPTIVLTGDA 

TSPEGETDKNLANRVHSPHKRLSHRHLKVST 

ASLTSVDPAGHIIDLVNDQLPDISISEEDKKKN 

LALLEEAKLVSERFLTRRGRKSRSSPGDSPSA 

VSFNLSPSASPTSSRSNSLTVPTPPEGDEADVS 

SPHPGEPNVPKGLADRKQNDQRKVSQGRLAP 

RPPPWXSKEIA1EQKENFDPLQYPETTPKGLA 

PVTNS SGKMALNSPQPGPVESELGKQLLKTG 

WEG SPLPRSPTQD AAGVGPPASQGRGPAGEP 

MGPEAGSKAELPPTVSRPPLLRGLSWDSGPEE 

PGPRLQKVLAKLPLAEEEKRFAGKAGGKLAK 

APGLKDFQIQVQPVRMQKLTKLREEHILMRN 

QNLVGLKLPDLSEAAEQEKGLPSELSPAIEEE 

ESKSGLDVMPNISDVLLRKLRVHRSLPGSAPP 

LTEKEVENVFVQLSSAFRNDSYTLESRINQAE 

RERKLTEENTEKELENFKASITSSASLWHHCB 

HRETYQKLLEDIAVLHRL.AARLSSRAEWGA
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Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion 














VR QEKRM SKATE VMMQ Y VBNLKRT YEKDH 

AELMEFKKLANQNSSRSCGPSEDGVLRTARS 

MSLTLGKNMPRRRVSVAWPKFNALNLPGQ 

TPSSSS1PSLPALSESPNGKGSLPVTSALPALLE 

NGK.TNGDPDCEASAPALTLSCLEELSQETKA 

RMEEEAYSKGFQEGLKKTKELQDLKEEEEEQ 

KSESPEEPEEVEETEEEEKDPRSSKLEELVHFL 

QVMYPKLCQrWQVIWMMAAVMLVLTVVL 

GLYNSYNSCAEQADGPLGRSTCSAAQKDSW 

WSSGLQHEQPTEQ 


451 


1801 


A 


3623 


504 


198 


QLIQHQTVHTGRKLYECKECGKAFNQGSTLI 
RHQRIHTGEKPYECKVCGKAFRVSSQLKQHQ 
RIHTGERPYQCKELKGRGAEMLAVLAVKEQ 
NRTPVNYGK 


452 


1802 


A 


3628 


2 


195 


MTCLHSAKAFHY'SSCSFSCEEGFALIGPEVV 

QCTALGVWTAPAPVCIAVQCQHLEALNEGT 

MG*DYPFTAFAYGSSCKYECHTVYRVRGLD 

MLHSRGCYLWNGHFTT* E AI SCEPLERPCH* S 

V*CSFSCEEGFALIGPEWQCTALGVWTAPAP 

VCIAVQCQHLEALNEGTMG 


453 


1803 


A 


3637 


662 


142 


IQAKGLGIWHVPNfCSPMQHWRVKGSLLRYRT 

DTGFLQTLGHNLLGIYQKYPVKYGEGKCWT 

DNGPVIPVVYDFGDAQKTASYYSPYGQREFT 

AGF VQFRVFNNERAANALCAGMRVTG CNTE 

HHCIGGGGYFPEASPQQCGDFSGFDWSGYGT 

\HVGYSSSREITFAAAVLLFYR 


454 


1804 


A 


3641 


1 


362 


TQVHPAMLGLDELGRSGCGHCTQADLRFGD 
AAGRDPGQDNDRNTAEPAFPPPPRVMAAAA 
ALRAPAQSSVTFEDVAVNFSLEEWSLLNEAQ 
GCLYHDVMLETLTLISSLGKVLILNCDLS 


455 


1805 


A 


3646 


2 


414 


AAAGRGASGALTGEGGGEQGRRVGLGSRAH 
SLLL GPTFNSCQVSS QPPRV AGLGLPL KJJEPS 
RPQPPSPRGPRTVRAGVPGAHPQDTPCPEFVR 
PRKVPLVGEAPGLPPEERSRGWRRDTPGLQE 
SRVRAPSYDDIT 


456 


1806 


A 


3656 


396 


8 


QP/Sr^SYLTLYTKNNLKSMKDLNVNTEMIK 
LLELKNIHNLG* AKFFLN* IQKALIKRKILIH W 
P/'LIKrK/SFCSLSDTIKKMKRQTIVWEQTFIIHr 
SVKELVSRIYEAFLQFNXTVNRPVFDIKKEQK 
F 


457 


1807 


A 


3660 


14 


1961 


SEAKLGGPTGMDLWQLLLTLALAGSSDAFSG 

SEATAAILSRAPWSLQSVNPGLKTNSSKEPKF 

TKCRSPERETFSCHWTDEVHHGTKNLGPIQLF 

YTRRNTQEWTQEWKECPDYVSAGENSCYFN 

SSFTSIWTPYCIKLTSNGGTVDEKCFSVDEIVQ 

PDPPIALNWTLLNVSLTGIHADIQVRWEAPRN 

ADIQKGWNfVUEYELQYKEVNETKWKMMDP 

ILTTSVPVYSLKVDKEYEVRVRSKQRNSGNY 

GEFSEVLYVTLPQMSQFTCEEDFYFPWLLIITF 

GIFGLTVMLFVFLFSKQQRJKMLILPPVPVPKI 

KGIDPDLLKEGKLEEVNTILAIHDSYKPEFHS 

DDSWVEFIELDIDEPDEKTEESDTDRLLSSDH 

EKXHINLGVKDGDSGRTSCCEPDILETDFNAH 

DIHEGTSEVAQPQRLKGEADLLCUXJKNQNN 

SPYHDACPATQQPSVIQAEKNKPQPLPTEGAE 

STHQAAHIQLSNPSSLSNIDFYAQVSDITPAGS 

WLSPGQKNKAGMSQCDMHPEMVSLCQENF 

LMDNAYFCEADAKKCIPVAPHIKVESH1QFVS 

LNQEDmTTESLT\TAAGSP\GTGEHVPGSEM
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I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
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T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion














PVPDYTSIHI VQ SPQGLILNAT ALPLPDKEFLS 
SCGYVSTDQLNKIMP 


458 


1808 


A 


3663 


154 


462 


TRAPASGRSGAGLALSANAPDSGGHPGATEG 
P AG SL AHASGS ARGTWRVRGRGSHG WERTV 
GAGGCANPVPALHSCASAPRGTGRVSALGPK 
TGSSPLSSPKG 


459 


1809 


A 


3664 


902 


135 


LGK YNTSMALFDFVLHN STGEIRYITEDDVIQ 

S QN ALG K YNT SMALFESN SFEKTEL ESP YYVD 

LNQTLFVQVSLHTSDPNLVVFLDTCRASPTSD 

FASPTYDL1KSGCSRDETCKWYPLFGHYGRF 

QFNAFKFLRSMSSVYLQCKVLICDSSDHQSRC 

\NQGCVSRSKRDISSYKWKTDSUGPIRLKRDR 

SA\NGNSGFQHETHAEETPNQPFNSVHLFSFM 

VLALNV VW ATITVRHFVN QRAD YQ\Y QKL Q 

NY 


460 


1810 


A 


3670 


850 


557 


LGILMSPQVEAGE1*ALLTPPPGCMQFSPLTL/P 
K* VWSPGLTP/PPPEVPSVFLVEPGLPHAGQA 
GLDLL\TSGDPPASTSQSARTTDVSHRAQPLAI 
S 


461 


1811 


A 


3671 


2472 


2099 


IGVLAFETG S C S VTRLYCIGIIMPHC SLDL AGS\ 
TSAFR1AGTTSVHHHPQLTFFFFWIETGSHCV 
VQTGL* LL ALSNPPALASQI AGI S GM SHRAWP 
GLVLYSLEFSLLCASQSUMLFTCYNE 


462 


1812 


A 


3672 


394 


110 


VKPVNGESKRD*GADTQTCEGEADEQLQT\N 
C Y YD/STKSFF YI SCG* KXRKPT W AENRRLN A 
KMFGIPLHSNSDPWGYEEREVIGFHKSRVSRG 
HGS 


463 


1813 


A 


3673 


348 


1 


QRNPF S AGHPQRPPTSGSQSELL AQPRLRPGR 

KSSFSRDQDVW* SQAVPKRQ* QRNPFS AGHP 

QRPPTSGSQSELLAQPRLRPGRKSSFSRDQDV 

WPGQKPRPSQQQHQMCASPTLGQRSPFALEP 

VPAYHGGRDPFASARPSPVGIPKPRAAPAGG 

GWRRIRPKSSTK 


464 


1814 


A 


3676 


2253 


320 


PVIQRCSQPYGFSLHSFFLKCVSETSQQPPSR 

KVFQLLPSFPTLTRSKSHESQLGNRIDDVSSM 

RFDLSHG SPQM VRRDIGLS VTFIRFSTXS WLS 

QVCHVCQK\SMIFGVKCKHCm,KCMNKCTKE 

APACRISFLPLTRLRRTESVPSDINNPVDRAAE 

PHFGTLPKALTKKEHPPAMNHLDSSSNPSSTT 

FSTPSSPAPFPTSSNPSSATTPP\NPSP\GQR\DSR 

FNFPSC/AYFEHHR\Q\QFIFPDISAFAHAAPLPE 

AADGTRLDDQPKADVLEAHEAEAEEPEAGK 

SEAEDDEDEVDDLPSSRRPWRGPISRKASQTS 

VYLQEWDIPFEQVELGEP1GQGRWGRVHRGR 

WHGEVAIRLLEMDGHNQDHLKLFKKEVMN 

YRQTRKENVVLFMGACMNPPHLAIITSFCKG 

RTLHSFVRDPKTSLDINKTRQIAQEIIKGMGY 

LHAKGIVHKDLKSRNVFYDNG\KVVITDFGLF 

\GISGVVP\EGRRENQLKXSHDWLCYLAPEIVR 

EMTTGKDEDQLPFSKAADVYAFGTVWYELQ 

ARDWPLKNQAAEASIWQIGSGEGMKRVLTS 

VSLGKEVSENLSACWAFDLQERPSVFSLXMD 

MLEKLPKLNRRLSHPGHF*KSADINSSKWPR 

FERFGLGVLESSNPKM 


465 


1815 


A 


3679 


8 


803 


IPSPAWWNSTWADTFSLLLALAVAEYLGYY 

WACVLQTHRAPCASNTEDLETVVNH1KHRYP 

QAPLLAVGISFGGILVLNHLAQARQAAGLVA 

ALTLSACWDSFETTRSLETPLNSLLFNQPLTA 

GI.CQLVERrSY/F*DI,QARTIRQFDFRYTSVA
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/=possible nucleotide deletion, \=possible
nucleotide insertion














FGYQDCVTYYKAASPRTKIDAJRIPVLYLSAA 
DDPFSTVCALPKQAAQHSPYVALLITARGGHI 
GFLEGLLPWQHWYMSRLLHQYAKAIFQDPE 
GLPDLRALLPSEDRNS 


466 


1816 


A 


3684 


3 


307 


SSQY1 VQSKTKIFL* AAREKQ/RHTCRRF SIRLS 
AN1SSQTGEARGQWPSVFKVLKEKKLSTKKS 
FGQK*GR\RKTFPDKQK7IJtEFDTTRPTIQEML 
TGVLQG 


467 


1817 


A 


3687 


2465 


837 


ELPTPLI AAHQL YN Y V ADHAS S YHMKPLRMA 

RPGGPEHNEYALVSAWHSSGSYLDSEGLRHQ 

DDFDVSLLVCHCAAPFEEQGEAERHVLRLQF 

FWLTSQREI -FPRT TADMRRFRKPPRLPPEPE 

APGSSAGSPGEASGLILAPGPAPLFPPLAAEVG 

MARARLAQLVRLAGGHCRRDTLWKRLFLLE 

PPGPDRLRLGGRLALAELEELLEAVHAKSIGD 

1DPQLDCFLSMTVSWYQSLIKVLLSRFPQSCR 

HFQSPDLGTQYLWLNQKFTDCFVLVFLDSH 

LGKTSLTWFREPFPVQPQDSESPPAQLVSTY 

HHLESVINTACFTLWTRLL*GSGLDH*MSLFL 

ESWAYQIACQRQD*PALLGPRASQTLSDTKG 

FVTMS*GSAAPAWQQEPPSPNTHSH*PIQDSR 

ESGQPRGPLGPFWGTPFGPPGRVSGVHTGWQ 

TPPRAPLPESCPL\PLTTVSHLCPLSLRVFTSHL 

DTTAGHSHRDDTW VPTPALPLKHLRPPS SPF A 

LGPWVSHPLMRWVQKLSHLHSNPGTGFSMG 

GKQQRN 


468 


1818 


A 


3691 


960 


499 


QTCRKDKRAIYPHFQNE* MNELKAI* SGTGGI 
QCFHSONDSAFFFFLFLLETEFCSAA/TVQWH 
DFLSMQPPPPGFKQFTCLSLLSSWNYRRXPPPF 
PGNF\* FL VKTGFPHVGQTGFELLTSSDL APL A 
SQNGGITGMSPCAWPFFFFFFFGLC 


469 


1819 


A 


3714 


4747 


495 


MAY S WQTDPNPNESHEKQYEHQEFLFVNQP 

HSSSQVSLGFDQIVDEISGKIPHYESEIDENTFF 

VPTAPKWDSTGHSLNEAHQISLNEFTSKSREL 

SWHQVSKAPAIGFSPSVLPKPQNTNKECSWG 

SPIGKHHGADDSRFSILAPSFTSLDKINLEKEL 

ENENHNYHIGFESSIPPTNSSFSSDFMPKEENK 

RSGHVNIVEPSLMLLKGSLQPGMWESTWQK 

NIESIGCSIQLVEVPQSSNTS^ASFCNKVKJOR 

ERYH AAD VNFNSGKI W STTTAFP YQLFSKTK 

FNTHIFIDNSTQPLHFMPCANYLVKDLIAEILH 

FCTNDQLLPKDHILSVWGSEEFLQNDHCLGS 

HKMFQKDKS VIQL HLQKSREAPGKLSRKHEE 

DHSQFYLNQLLEFMHIWKVSRQCLLTL1RKY 

DFHLKYLLKTQENVYNIIEEVKKICSVLGCVE 

TKQITDAVNELSLILQRKGENFYQSSETSAKG 

LIEKVTTFI „STSrYQLINVYCNSFYADFQP VNV 

PRCTSYLNPGLPSHLSFTVYAAHNIPETWVHR 

INFPLEIKSLPRESMLTVKLFGIACATNNANLL 

A WTC LPLFPKEKSILG SMLFSMTLQSEPPVEM 

ITPGV WD VSQPS P VTLQIDFPATGWE YMKPD 

SEENRSNLEEPLKECIKHIARLSQKQTPLLLSE 

EKKRYLWFYRFYCNNENCSLPLVLGSAPGW 

D ERTVSEMHTILRR W TFS Q PL EALGLLTS S FP 

DQEIRKVAVQQLDNLLNDELLEYLPQLVQAV 

KFEWNLESPLVQLLLHRSLQSIQVAHRLYWL 

LKN AENE A YFK SW YQKLLAALQFC AGKALN 

DEFSKEQKLIKILGDIGERVKSASDHQRQEVL 

KKEIGRLEF.FFQDVNTCHLPLNPALCTKGIDH 

DACSYFTSNALPLKJTFINANLMGK.NISIIFKA 
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GDDLRQDMLVLQLIQVMDNIWLQEGLDMQ 

Mil YRCL STGKDQ RL VQMVPDA VTLAKIHRH 

SGLIGPLKENTIKKWFSQHNHI.KADYFKALR 

NFFYSCAGWCWTFILGVCDRHNDNIMLTKS 

GHMFHDDFGKFLGHAQTFGGIKRDRAPFIFTS 

EM\EYFITEGG\KNPQHFQDFV\ELCCRAYNIIR 

KHSQLLL\NLL\EMML YAG\LPELSGI\QDL KY 

VYNNLRPQDTDLEATSHFTKKIKESLECFPVK 

LNNLIHTLAQMSAISPAKSTSQTFPQESCLLST 

TRSIERATILGFSKKSSNLYLIQVTHSNNETSL 

TEKSFEQFSKLHSQLQKQFASLTLPEFPHWW 

HLPFTNSD HRRFRD LNHYMEQILNV S HE VTN 

SDCVLSFFLSEAGQOTVEESSPVYLGEKFPDK 

KPKVQLVISYEDVKLTILVKHMKNIHLPDGSA 

PSAHVEFYLLPYPSEVRRRKTKSVPKCTDPTY 

NEIVVYDEVTELQGHVLMLIVKSKTVFVGAI 

NIRLCSVPLDKEKWYPLGNSU*PLLLFSSFGM 

KSLEKDEFVGGMLLSNPIW 


470 


1820 


A 


3718 


430 


75 


SHG S1SILNLHQGCVFLPSLPAQGLRC YRCL A 
VLEG ASC S WSCPFLDGVCVSQK VS V/C WQ*/ 
CPWGARAEGRLSAWDSQISCCKGDLCNAV 
VLAAGSP W ALCVQLLL SLG S VFLW ALL 


471 


1821 


A 


3723 


891 


494 


LRQSL/NSVPQAGVQWRDSSLQAPPPRFTPLS 
CLSLPSS WDYRRLPPCLANFLYF* * RRGFTML 
ARMVLIS*PRDPPASASQ\STEITGGSHRAQHP 
TDSRDHSERSVKKSHEVISELRMKVIKCKVAF 
SKNPI 


472 


1822 


A 


3734 


443 


251 


GFIET*NFCVSKDTSKKLS/RLPTKWKNVFAN 
* ISDKGLVSRICQELLRHLD AEQ VS STAGL SL 


473 


1823 


A 


3746 


3 


500 


THASGGARSGAGWAGRGVRAGTEAGRGGIF 

LTLSILRTRDLPS GAMSEG VDLIDI Y ADEEFNQ 

DPEFNNTDQIDLYDDVLTATSQPSDDRSSSTE 

PPPPVRQEPSPKPNNKTPAILYTYSGLRNRRA 

AVYVGSFSWWTTDQQLIQVIRS1GVYDVGEV 

KFAENRAK 


474 


1824 


A 


3753 


2 


5262 


RPLF AREG GI YA VLVC MQEYKTS V\L VQQ AG 

LAALKMLAVASSSEIPTFVTGRDSIHSLFDAQ 

MTREIFASEDSATRPGSESLLLTVPAAVILMLN 

TEGCSSAARNGLLLLNLLLCNHHTLGDQ1ITQ 

ELRDTLFRHSGLAPRTEPMPTTRTILMMLLNR 

YSEPPGSPXERAALETPHQGQDGSPELLIRSLV 

GGPSAELLLDLERVLCREGSPGGAVRPLLKRL 

QQETQPFLLLLRTLDAPGPNKTLLLSVLRVIT 

RIXDFPEAMVLPWHEVLEPCLNCLSGPSSDSE 

rVQELTCFLHRLASMHKDYAVVLCCLGAKEI 

LSKVLDKHSAQLLLGCELRDLVTECEKYAQL 

YSNLTSSILAGCIQMVLGQ1EDHRRTHQPINIP 

FFD VR.RHL CQGSS VE VKEDKC WEKVEV S SN 

PHRASKLTOHNPKTYWESNGSTGSHYITLHM 

HRGVLVRQLTLLVASEDSSYMPARWVFGG 

DSTSC1GTELNTVNVMPSASRVTLLENLNRFW 

PIIQIRIKRCQQGGIDTRVRGVEVLGPKPTFWP 

LFREQLCRRTCLFYTIRAQAWSRDIAEDHRRL 

LQLCPRLNRVLRHEQNFADRFLPDDEAAQAL 

GKTCWEALVSPLVQNITSPDAEGVSALGWLL 

DQYLEQRETSRNPLSRAASFASRVRRLCHLL 

VKVEPPPGPSPEPSTRPFSKNSKGRDRSPAPSP 

VLPSSSLRNITQCWLSWQEQVSRFLAAAWR 

AFDFVPRYCKL YEHL QRA GSELFGPRAAFMI , 
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475 












ALRSGFSGALLQQSFLTAAHMSEQFARYIDQ 

QIQGGUGGAPGVEMLGQLQRHLEPIMVLSG 

LELATTFEItFYQHYMADRLLSFGSSWLEGAV 

LEQIGLCFPNRLPQLMLQSLSTSEELQRQFHLF 

QLQRLDKLFLEQEDEREKRL*EEEEEF,FEEEA 

EKELFIEDPSPAISILVLSPRCWPVSPLCYLYHP 

RKCLPTEFCDALDRFSSFYSQSQNHPVLDMG 

PHRRLQWTWLGRAELQFGKQILHVSTVQMW 

LLLKFNQTEEVSVETLLKDSDLSPELLLQALV 

PLTSGNGPLTLHEGQDFPHGGVLRLHEPGPQ 

RSGEALWLIPPQAYLNVEKDEGRTLEQKRNL 

LSCLLVRILKAHGEKGLH1DQLVCLVLEAWQ 

KGPNPPGTLGriTVAGGVACTSTDVLSCILHLL 

GQGYVKRRDDRPQILMYAAPEPMGPCRGQA 

DVPFCGSQSETSKPSPEAVATLASLQLPAGRT 

MSPQEVEGLMKQTVRQVQETLNLEPDVAQH 

LLAHSHWGAEQLLQSYSEDPEPLLLAAGLCV 

HQAQAVPVRPDHCPVCVSPLGCDDDLPSLCC 

MHYCCKSCWNEYL.TTRIEQNLVLNCTCPIAD 

CPAQPTGAFIRAJVSSPEVISKYEKALLRGYVE 

SCSNLTWCTNPQGCDRILCRQGLGCGTTCSK 

CG WA SCFKCSFPEAH YPASCGFIMSO W VDDG 

GYYDGMSVEAQSKHLAKLISKRCPSCQAPiE 

KNEGCLHMTCAKCNHGFCWRCLKSWKFNH 

KDYYNCSAMVSKAARQEKRFQDYNERCTTH 

HQAREFAVNLRNRVSAIHEVPPPRSFrFLNDA 

CQGLEQARKVLAYACVYSFYSQDAEYMDVV 

EQQTENLELHTN ALQILLEETLLRCRDLAS SL 

RLLRADCLSTGMELLRRIQERLLAILQHSAQD 

FRVGLQSPSVEAWEAKGPNMPGSQPQASSGP 

EAEEEEEDDEDDVPEWQQDEFDEELDNDSFS 

YDESENLDQETFFFGDEEEDEDEAYD 


1825 


A 


3754 


1093 


96 


GTSRNQHSPKTHA*RS S/WPQPPPLFLPPLQPQ 

ATGRRRRRTRTQQRTAALLTDGTTKTGAAW 

SRRPSLCWPSRTTGAPGAK*AVLVRSA 1P1 1 N 

PPNPQSPTGAAGKJLRAPGNRAG/SEPSSQEPPP 

DGTR\RPASITGVAQSPATRATPSLPCLHVPAP 

SRGQTLGVRTTGRASRLTVDRSRLS WPGRSA 

RSGGGRWRPNAPRGRWPRAP+ SWEPGSWTE 

PWRWPFPAAESPPHRCIYCTNHVSPAGPARPS 

HVYHRATTNSISHPLCRAQSSPWEAAGVWRR 

PAQPAPTSDVNINLLRKPRVKRHDLIYQFLGN 

TLWEEGRQRPPETLQPAR 


476 


1826 


A 


3758 


901 


521 


FFFGNGVSPCPQAGV*WHDLDSLQNLPPGFK 
RFSYLSLPSSW\DYRHVPPRQANFCIF/M*RRG 
FTMLARMVSIS*PRDLPALASQSAGITGVSHH 
APPQMDFTFALLCFAPKGCLPRQKEGGTLNLI 


477 


1827 


A 


3761 


843 


575 


GVISAHCNLRL/CHLPGSSNSPASASQVAGTIG 
ARTTPS* IFVFLVETGFHHVSQDGLDLL/NFV1 
RPRRPLKVLGLQACTRARLPSPLKEL 


478 


1828 


A 


3763 


267 


1240 


HLLSFHLWSASX^DCLEQLSQERHVKGNCLLGP 

PPVNESTKPSPSPWKLTPPMCSrPPVFPPKSGS 

PTTSWSyPSGHSKLEVERAQTGPFCLHIYCP*P 

GVTDNTTSLLHY1PFPRL\SGLVCFPAH*FPSY 

WTGHSFASQAWLRQVPEVSKHLOCPSAESLL 

TMEYHQPEDPAPGKAGTAEAVIPENHEVLAG 

PDEHPQDTD ARD ADGEAREREP/RRPSF AA* P 

VWGQPXESPLPEASSAPPGPTLGTLPEVETIRA 

CSMPQELP*SPRTRQPEPDFYCVKWIPWKGE 

QTPIITQSTNGPLPSPCHHEHPLSSVEGEAPPA 
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EGSDHIG 


479 


1829 


A 


3766 


2 


2152 


YSPIRLLEVCVPLPKIFIKRQAPLKVSLLQDLK 

DFFQKVSQVYVAIDERLASLKTDTFSKTREEK 

MEDIFAQKEMEEGEFKN WIEKMQ ARLMSS S 

VDTPQQLQSVFESLIAKKQSLCEVLQAWNNR 

LQDLFQQEKGRKRPSVPPSPGRLRQGEESKIS 

AMDASPRNISPGLQNGEKEDRFLTTLSSQSST 

SSTHLQLPTPPEVMSEQSVGGPPELDTASSSE 

DVFDGHLLGSTDSQVKEKSTMKAIFANLLPG 

NSY>TPIPFPFDPDKHYLMYEHERVPIAVCEKE 

PSSIIAFALSCKEYRNALEELSKATQWNSAEE 

GLPTNSTSDSRPKSSSPIRLPEMSGGQTNRTTE 

TEPQPTKKASGMLSFFRGTAGKSPDLSSQKRE 

TLRGADSAYYQVGQTGKEGTENQGVEPQDE 

VDGGDTQKKQLINPHVELQF SDANAKF YCRL 

YYAGEFHKMREV1LDSSEEDFIRSLSHSSPWQ 

ARGGKSGAAFYATEDDRFILKQMFRLEVQSF 

LDFAPHYFNY1TNAVQQKRPTALAKJLGVYRI 

GYKNSQNNTEKKLDLLVMENLFYGRKMAQ 

VPDLKG SLRNRNVKTDTGKES CD V VLLDENL 

LKMVRDNPLY1RSHSKAVLRTSIHSDSHFLSS 

HLIIDYSLLVGRDDTSNELWGIIDYIRTFTWD 

KKLEMWKSTGILGG QG* MPTWSPELYRTR 

FCEAMDNYFLMVPDHCTGLGLNC 


480 


1830 


A 


3777 


251 


3 


QGCGS AGTLIHY * * ECKMVQLLWKT V * QFLI 
KLNrxKDPAJTLDVYPNEVKNYVRTKTYTQVfr 
I/AN FIMAKS WKQPTHPS VRT 


481 


1831 


A 


3779 


333 


3 


EAAIRQPEPNILDVNQTFKDLAMIIHDQGDL1D 
S IE AN AESSEVL VERAPG QLQRPA\YYQKKSR 
KKMCLWLVQTAIILICERIM*WYTTKWSPPI 
VLPVSCFQGQKFN 


482 


1832 


A 


3780 


2 


371 


tggrqgkndhtsitekpsrdfnrhlitqni* m 
pnqdmksssnsliirkvqikptelyhhifrrka 
kmk:ttdktkyr*gfkaittlihcsqdckxq*s 

/L*ENHFMlFPKAEQHrTYDTTIPFLR 


483 


1833 


A 


3787 


43 


448 


LMKDLSPYVMETFfYILNRLNER/RSMWRHlIG 
KLPNTKDQEKILKAIRGRREVIQGS/RQQYRR 
PAAFSAAEKARKLWCS/VFNIERRNL/CEYPTK 
I.SFNTKGEMTFSDKTEFTTNRPSLKMLL.KDRI 
QEEGKMF* FCEK.CFKRKE 


484 


1834 


A 


3798 


1 


727 


FFFFETESRSVAQAGVQWCNLGSLQALPPGF\ 

SHSPASASRVAGTTGTRH*ARLIFYIFSRDGVS 

PC*PGWS* SPDLVIRPFARLPKCWDYRREPPRP 

A*FFVn.VE\QGFTMLARMVSIS*PQ/CDI,PAS 

VSQNAGITGVSHCAWPCLHFCFFGFFFEMESC 

SVAQAEVQWHDLRSLQAPPPGFTPFSCLSLPG 

SWDYRRPPPRPAKF\CrFSRDGVSPC*PGWSRS 

PDLVTRPPRPPKVLGLQA 


485 


1835 


A 


3802 


1 


239 


FFFFEMECLTVSQAGVQWYNLHSLQPLPPGF 

KQFSC\LSLPSSWD*RVPTSRPAKF/CVIF*DGV 

SHCQPGWSAWQPPLH 


486 


1836 


A 


3811 


378 


98 


RYD*SSQSENIP\QKEFLLKYP*CTATLGMRN 
MSIMKKKSIFSA£FYKVSLPSLLL\HLLAIEWG 
FHIEI QLTTHQHFLNYELESDFVinVE YM 


487 


1837 


A 


3814 


771 


320 


FDPD WTRAAGIRHEKKPKAL A YRRENSPGDL 
PPPPLPPPEEEASWAL/GAEGSRQHVLPGAGA 
QWGEESGPGRAPGSPAGAPPR*RGLAP\NSRP 
SFLSRGQGTSTCSTAG SNSSRGSS S SRG SRGPG 
RSRSRSQSRSQSQRPGQKRREEPR 
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488 


1838 


A 


3818 


1 


781 


FRACLLELIPYAPTLSWTACPPAMAGPRGLLP 

LCLLAFCLAGFSFVRGQVLFKGCDVKTTFVT 

HWCTSCAAIXKQTCPSGWLRELPDQITQDCR 

YEVQLGGSMVSMSGCRRKCRKQWQKACCP 

GYWGSRCHECPGGAETPCNGHGTCLDGMDR 

NGTCVCQENFRGSACQECQDPNRFGPDCQSV 

CSC VHG VCNHGPRGDG SCLCF AGYTGPHCD 

QELPVWQELGFPQNNPRLRKAPNCKCLPG*H 

RNGLIATPNPCRP 


489 


1839 


A 


3822 


934 


669 


FFFSEMESRSVTRLECSGAISAHLRLLGSSNSP 
ASAS*VAGTIGACHHAQLIFVFLVETGFHHVG 
QDGLDLL/NLMIHPPRPPKVLGFQA 


490 


1840 


A 


3825 


79 


9748 


GCQSCWPAWPRLRRRGPASAGARLGRKAPW 

GLPGRVQDGRPLRFCFYLRPRAPFIAPVLSGA 

ASRPEASGDCRAGRETAMATLEKLMKAFESL 

KSFQQQQQQQQQQQQQQQQQQQQQQQPPPP 

PPPPPPPQLPQPPPQAQPLLPQPQPPPPPPPPPP 

GPAVAEEPLHRPKKELSATKKDRVNHCLTIC 

ENIVAQSVRNSPEFQKLLGIAMELFLLCSDDA 

ESDVRMVADECLNKVIKALMDSNLPRLQLEL 

YKEIKKNGAPRSLRAALWRFAELAHLVRPQK 

CRPYLVNLLPCLTRTSKRPEESVQETLAAAVP 

KIMASFGNFANDNEIKVLLKAFIANLKSSSPTI 

RRTAAGSAVSICQHSRRTQYFYSWLLNVLLG 

LLVPVEDEHSTLL1LGVLLTLRYLVPLLQQQV 

KDTSLKGSFG VTRKEMEV SPS AEQLVQ V YEL 

TLHHTQHQDHNVVTGALELLQQLFRTPPPEL 

LQTLTAVGGIGQLTAAKEESGGRSRSGSIVELI 

AGGGS S CSP VLSRKQKGKVLLGEEE ALEDDS 

ESRSDVSSSALTASVKDEISGELAASSGVSTPG 

SAGHDIITEQPRSQHTLQADSVDLASCDLTSS 

ATDGDEEDiLSHSSSQVSAVPSDPAMDLNDG 

TQ ASSPI SD S S QTTTEGPDS A VTPSDSSEIVLD 

GTDNQYLGLQIGQPQDEDEEATGTLPDEASEA 

FRNSSMALQQAHLLKNMSHCRQPSDSSVDKF 

VLRDEATEPGDQENKPCRIKGDIGQSTDDDS 

APLVHCVRLLSASFLLTGGKNVLVPDRDVRV 

SVKALALSCVGAAVAJLHPESFFSKLYKVPLD 

TTEYPEEQYVSDILNYIDHGDPQVRGATALLC 

GTLICSILSRSRPHVGDWMGTIRTLTGNTFSL 

ADCIPLLRKTLKDESSVTCKL.ACTAVRNCVM 

SLCSSSYSELGLQLIIDVLTLRNSSYWLVRTEL 

LETLAEIDFRLVSFLEAKAENLHRGAHHYTGL 

LKLQERVLNNVVIHLLGDEDPRVRHVAAASL 

IRLVPKLFYKCDQGQADPWAVARDQSSVYL 

KLLMHETQPPSHFSVSTITRIYRGYNLLPSITD 

VTMENhTLSRVIAAVSHELrrSTTRAETFGCCE 

ALCLLSTAFPVCIWSLGWHCGVPPLSASDESR 

KSCTVGMATMJLTLLSSAWFPLDLSAHQDAL 

ILAGNLLAASAPKSLRSSWASEEEANPAATK 

QEEVWPALGDRALVPMVEQLFSHLLKVTNIC 

AHVLDDVAPGPAIKAALPSLTNPPSLSPrRRK 

GKEKEPGEQASVPLSPKKGSEASAASRQSDTS 

GPVTTSKSSSLGSFYHL.PS YLKLHDVLKATHA 

NYKVTLDLQNSTEKFGGFLRSALDVLSQILEL 

ATLQDIGKCVEEILGYLKSCFSREPMMATVC 

VQQLLKTLFGTNLASQFDGLSSNPSKSQGRA 

QRLGSSSVRPGLYHYCFMAPYTHFTQALADA 

SLRNMVQAEQENDTSGWFDVLQKVSTQLKT 

NLTSVTKNRADKNA IHNHIRLF EPL VIKALKQ 
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VTTTTCVQLQKQVLDLLAQLVQLRVNYCLL 
DSDQVFIGFVLKQFEYIEVGQFRESEAIIPNIFF 
FLVLLSYERYHSKQIIGIPKIIQLCDGIMASGR 
KAVTHAIPALQPIVHDLFVLRGTNKADAGK£ 
LETQKE WVSMLLRI ,TQYHQ VLEMFILVLQQ 
CHKErVEDKWKRLSRQlADriLPMLAKQQMHI 
DSHEALGVLNTLFEILAPSSLRPVDMLLRSMF 
VTPNTMASVSTVQLWISGILAILRVLISQSTED 
rVLSRIQELSFSPYLISCTVINRLRDGDSTSTLE 
EHSEGKQIKNLPEETFSRFLLQL VGILL EDIVT 
KQLKVEMSEQQHTFYCQELGTLLMCLIHTFKS 
GMFRRTTAAATRLFRSDGCGGSFYTLDSLNLR 
ARSMITTHPALVLLWCQILLLVNHTDYRWW 
AEVQQTPKRHSLSSTKLLSPQMSGEEEDSDLA 
AKLGMCNREIVRRGALILFCDYVCQNLHDSE 
HLTWLIVNH1QDLISLSHEPPV QDFISA VHRNS 
AASGLFrQArQSRCENLSTPTMLKKTLQCLEGI 
HLSQSGAVLTLYVDRLLCTPFRVLARMVDIL 
ACRRVEMLLAANLQSSMAQLPMEELNRJQEY 
LQS SGL AQRHQRLYSLLDRFRLSTMQD S L SPS 
PPVSSHPLDGDGHVSLETVSPDKDWYVHLVK 
SQCWTRSDSALLEGAELVNRIPAEDMNAFM 
MNSEFNLSLLAPCLSLGMSEISGGQKSALFEA 
AREVTLARVSGTVQQLPAVHHVFQPELPAEP 
AAYWSKLNDLFGDAALYQSLPTLARALAQY 
LVVVSKLPSHLHLPPEKEKDIVKFVVATLEAL 
SW1ILIIIEQIPLSLDLQAGLDCCCLALQLPGL 
WS VVSSTEFVTHACSLIYCVHFILE A VA VQPG 
EQLLSPERRTNTPKAISEEEEEVDPNTQNPKYI 
TAACEMVAEMVES LQS VLALGHKRNSG VPA 
FLTPLLRNniSLARLPLVNSYTRVPPLVWKLG 
WSPKPGGDFGTAFPEIPVEFLQEKEVFKEFIYR 
INTLGWTSRTQFEETWATLLGVLVTQPLVME 
QEESPPEEDTERTQINVLAVQAITSLVLSAMT 
VPVAGNPAVSCLEQQPRNKPLKALDTRFGRK 
LSIIRGJVEQEIQAMVSKRENIATHHLYQAWD 
PVPSLSPA'ITGALtSHEKLLLQINPERELGSMS 
YKLGQVSIHSVWLGNSITPLREEEWDEEEEEE 
ADAPAPSSPPTSPVNSRKHRAGVDIHSCSQFL 
LELYSRW1LPS S SARRTPAILISE WRSLL WS 
DLFTERNQFELMYVTLTELRRVHPSEDEILAQ 
YLVPATCKAAAVLGMDKAVAEPVSRLLESTL 
RSSHLPSRVGALHGVLYVLECDLLDDTAKQL 
rPVISDYLLSNLKGIAHCVNIHSQQHVLVMCA 
TAFYLENYPLDVGPEFSASUQMCGVMLSGS 
l Piii YHCALRGLERLLLSEQLSRLDAESL 
VKLSVDRVNVHSPHRAMAALGLMLTCMYT 
GKEK VSPGRTSDPNPAAPDSESVI VAMER VS 
VLFDRIRKGFPCEARVVARILPQFLDDFFPPQ 

DIMNKVTGRFI ^"KJfV^PVPrtFKx a T\n/vi/^;rn-r 
uuvuiiv Yiyijrboiiyyr I ryf JVLA J V V Yf\ VJ-(J J 

LHSTG QSSMVRDWVML SLSNFTQRAPV AMA 
TWSLSCFFVSASTSPWVAAILPHVISRMGKLE 

qvdvnlfclvatdfyrhqieeeldrrafqsv 
lew aapgspyhrlltclrn vhkvttc 


491 


1841


A 


3826 


469 


302 


SNPPASASRVAGITGVHQHAWLIFVFLVEMEF 

hhvgqavlkxlisgdlpvsasqsa 


492 


1842 


A 


3836 


392 


88 


vapspmdmpdlyfyrdpeeiekee*aaaek:\ee 
fqsewtaw/p/eftatqsevadwfkdmqvp 

SVPIQQFPTEDWST*PTMKDWSATSTAQTTC 

wvrittewp 
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493 


1843 


A 


3838 


19 


380 


TTSDMNRAFETDTQ S IGEKNRSPSEPD YFERK 
KFKRS^EKAHIRYIGDQPEDIPLKVEFLCKHSK 
CTATLSMRNMSLMKKXCSFSEEF\LAFFPSLL 
VCHLLAIKLGFVffilHLTTFNNTF 


494 


1844 


A 


3845 


2 


352 


FFFL RRSL/D S V AQ AE AQ WLVE LGLLQ APPPG F 
KPISLP\GLPSSWDYGRPPPCPANFCIF/M*RRG 
FTVLARMVLIS* PCDPPTL ASQGTATTGM S YH 
ARPQDIDFLYAHQGRCWFRLL 


495 


1845 


A 


3847 


1774 


40 


DIFFRRAKEGMGQDEAQFSVEMPLTGKAYL 

WADKYRI^RXPRFTNRVHTGFEW'NKYNQTHY 

DFDNPPPKIVQGYKFNIFYPDLIDKRSTPEYFL 

EACADNKDFAILRFHAGPPYEDIAFKIVNRFAV 

EYSHRHGFRCQFANGIFQLWFHFKRYRYRR* 

RPWGTAGRCPRGHSKGASVKXWTPGPLSGL 

QGRGFTSHLRPHLSFARPQFPPI*KGGHH*AC 

HGELRRHWDRLA*GPDATEGALGASFEHEG 

GQQPPADLTVQADTLHRPSARLGGAHRACPK 

RRPHRVLWRWARGAWAWRCQAREKQETQG 

QPCHITGHPLGREAEPAAAGAAPALAHRPPF 

ARTGSTE\PGPCWRPIRHCRRDPLWTPTLC\RD 

WPPTHPVLAGGVHFPAAG/IGGCVEVPVSVN 

VMGTKSH*AVLPPPPSTGPGGQGLPEGWGLE 

KQEGLPPGIPPPGLLTGPW\SMRPVTPSFAH1R 

TVAPSHSPFSGQEGRGPHGCHSPGRVSGPVAGR 

LVLQHPTGTSPTEAKRKVPPGPPEGHPTSPVT 

SPRPPTAPPRHPASSGNSSVCFSKKTCRWEKK 

SFVLMELAYWQDRMFF 


496 


1846 


A 


3849 


830 


442 


AKSPLPLG*IQWR/NLGSLKLRLPGFK*FTCLG 
LLSSWDYRSLPPRPVNFCILVELGFHHVDQAG 
LKLLTSSALPALASQSAEITGMSHRIWPLPLLR 
RPPVmiRAPPQRLPFNLITSLKALSPNMATF 


497 


1847 


A 


3859 


2 


393 


ALRKTRRDGIARTGAQPAASWKGTNNYPWR 

LEMAGRPGSQEQSKDRGTGSLPPPSQRPLGPS 
PEGAGPSPPPPGIPRGGGSSSSEGP/PQLLFVPR 
RFPAPKKGLPSDTPHSKAPPTPHLILGGEDSQ 
VPIL 


498 


1848 


A 


3860 


253 


634 


KNASTVYSSQGDPKSFFFLLRWSLALVAQAG 
EQ*RDLSSLQPPPPGFK*FSCLSLPSSWD\YRCP 
LPCLANF\*FLVETGFHHVGQADLKLLTSGDP 
PTSASESAGITGVSHRAWPRIHFLYWKTFFL 


499 


1849 


A 


3863 


423 


263 


APSQISVAFLYAA/DKLFEKEI* KKIPFI1AS/DKI 

KJUINLTKEVKYLYTENYriLMKEIK/DTDKW 

KDILY*WIGKIN1*KMSTPPKAIYRFNAIPTK]P 

MTFFTEEKSnKFWNHKKPP>rrQSNIEQKE* S 

FCSn.LWWGGFLWFHMNFMIDFSISVKNVIGI 

LVGIALNL 


500 


1850 


A 


3865 


2 


15246 


LPRGCLWCLQRSPTPARPQPSRPARSPLPLFP 

DLRPWASDLDIMGDAEGEDEVQFLRTDDEV 

VLQCSATVLKEQLKLCLAAEGFGNRLCFLEP 

TSN AQNVPPDLAICCFVLEQ SLSVRALQEML 

ANTVEAGVESSQGGGHRTLLYGHAILLRHAH 

SRMYLSCLTTSRSMTDKLAFDVGLQEDATGE 

ACWWTMHPASKQRSEGEKVRVGDDIILVSVS 

SERYLHLSTASGELQVDASFMQTLWNMNPIC 

SRCEEGFVTGGHVLRLFHGHMDECLTISPADS 

DDQRRLVYYEGGAVCTHARSLWRLEPLRIS 

WSGSHLRWGQPLRVRHVTTGQYLALTEDQG 

LVWDASKAHTKATSFCFRISKEKLDVAFKR 

DVEGMGPPEKYGESLCFVQHVASGLWLTYA 
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APDPKALRLGVLKKKAMLHQEGHMDDALSL 

TRCQQEESQAARMIHSTNGLYNQFIKSLDSFS 

GKPRGSGPP AGT ALPIEGV1L SLQDL II YFEPPS 

EDLQHEEKQSKLRSLRNRQSLFQEEGMLSMV 

LNCDDRLNVYTTAAHFAEFAGEEAAESWKEI 

VNLLYELLASLJRGNRSNCALFSTOLDWLVS 

KLDRLEASSGILEVLYCVLIESPEVLNUQENHI 

KSIISLLDKHGRNHKVLDVLCSLCVCNGVAV 

RSNQDLITENLLPGRELLLQTNLINYVTSIRPN 

IFVGRAEGTTQYSKWYFEVMVDEVTPFLTAQ 

ATHLRVGWALTEGYTPYPGAGEGWGGNGV 

GDDLYSYGFDGLHLWTGHVARPVTSPGQHL 

LAPEDVISCCLDLSVPSISFRTNGCPVQGVFESF 

NLDGLFFPWSFSAGVKVRFLLGGRHGEFKF 

LPPPGYAPCHEAVLPRERLHLEPIKEYRREGP 

RGPHLVGPSRCLSHTDFVPCPVDTVQIVLPPH 

LERIREKLAENIHELWALTRTEQGWTYGPVRD 

DNKRLHPCLVDFHSLPEPERNYNLQMSGETL 

KTLLALGCHVGMADEKAEDNLKKTKLPKTY 

MMSNGYKPAPLDLSHVRLTFAQTTLVDRLAE 

NGHNVWARDRVGQGWSYSAVQDIPARKNPR 

LVPYRLLDEATKRSNRDSLCQAVRTLLGYGY 

NIEPPDQEPSQVENQSRCDRVRIFRAEKSYTV 

QSGRWYFEFEAVTTGEMRVGWARPELRPDV 

ELG ADELAYVFNGHRG QRWHLG SEPFGRP W 

QPGDVVGCMIDLTENTIIFTLNGEVLMSDSGS 

ETAFREIEIGDGFLPVCSLGPGQVGHLNLGQD 

VSSLRFFAICGLQEGFEPFAINMQRPVTTWFS 

KGLPQFEPVPLEHPHYEVSRVDGTVDTPPCLR 

LTHRTWG SQN SLVEMLFLRLSLPVQFHQHFR 

CT AG ATPL APPGL Q PPAEDEARAAEPDPD YE 

NLRRS AG G WSEAENGKEGT AKEG APGGTPQ 

AGGEAQPARAENEKDATTEKNKJCRGFLFKA 

KKVAMMTQPPATPTLPRLPHDVVPADNRDD 

PEIILNTTTYYYSVRVFAGQEPSCVWAGWVT 

PDYHQHDMSFDLSKVRWTVTMGDEQGNV 

H SSLKC SNCYMV WGGDFVSPGQQGRISHTDL 

VIGCLVDLATGLMTFTANGKESNTFFQVEPN 

TTO.FPAVFVLPTHQNV1QFELGKQKNIMPLSA 

AMFQSERKNPAPQCPPRLEMQMLMPVSWSR 

MPNHFLQ VETRRAG ERLG WA VQCQEPLTMM 

ALHIPEENRCMDILELSERLDLQRFHSHTLRL 

YRAVCALGNNRVAHALCSHVDQAQLLHALE 

DAHLPGPLRAGYYDLLISIHLESACRSRRSML 

SEYIVPLIPETRArrLFPPGRSTENGHPRHGLP 

GVGVTTSLRPPHHFSPPCFVAALPAAGAAEAP 

ARLSPAJ PLEALRDKALRMLGEA VRDGGQHA 

RDP VGA S VEFQF VPVLKL VSTLL VMGIFGDE 

DVKQILKMIEPEVFTEEEEEEDEEEEGEEEDEE 

EKEEDEEETAQEKEDEEKEEEEAAEGEKEEG 

LEEGLLQMKLPES VKLQMCHLLE YFCDQELQ 

HRVESLAAFAERYVDKLQANQRSRYGLLIKA 

FSMTAAETARRTREFRSPPQEQINMLLQFKDG 

TDEEDCPLPEEIRQDLLDFHQDLLAHCGIQLD 

GEEEEPEEETTLGSRLMSLLEKVRLVXKKEEK 

PEEERSAEESKPRSLQELVSHMWRWAQEDF 

VQSPELVRAMFSLLHRQYDGLGELLRALPRA 

YTISPS S VEDTMSLLECLGQIRSLLIVQMGPQE 

ENLMlQSIGNrNOWKVFYQHPNLMRALGMIIE 

WMEVMVNVLGGGESKJEIRFPKMVTSCCRFL 
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CYFCRISRQNQRSMFDHLSYLLENSGIGLGM 

QGSTPLDVAAASVIDNNELALALQEQDLEKV 

VSYLAGCGLQSCPMLVAKGYPDIGWKPCGG 

ERYLDFLRFAVFVNGESVEENANVWRLLIR 

KPECFGPALRGEGGSGLLAAIEEATRISEDPAR 

rX}PGIRRDRRREHFGEEPPEENRVHLGHAIMS 

FYAALIDLLGRCAPEMHLIQAGKGEALRIRA1 

LRSLVPLEDLVGnSLPLQIPTLGKDGALVQPK 

MSASFVPDHKASMVLFLDRVYGIENQDFLLH 

VLDVGFLPDMRAAASLDTATFSTTEMALAV 

NRYLCLAVLPLITKCAPLFAGTEHRAIMVDS 

MLHTVYRLSRGRSLTKAQRD V I EDCLM SLCR 

Y1RPSMLQHLLRRLVFDVPILNEFAKMPLKLL 

TNHYERCWKYYCLPTGWANFGVTSEEELHL 

TRKLFWGIFDSLAHKKYDPELYRMAMPCLC 

AIAGALPPDYVDASYSSKAEKKATVDAEGNF 

DPRP VETLN VI IPEKLD SFINKF AEYTHEKW AF 

DKIQNNWSYGENIDEELKTHPMLRPYKTFSE 

KDKErYRWPIKESLKAMIAWEWTIEKAREGE 

EEKTEKKKTAKISQ S AQTYDPREG YNPQPPDL 

SAVTLSRELQAMAEQLAENYHNTWGRKKKQ 

ELEAKGGGTHPLLVPYDTLTAKEKARDREKA 

QELLKFLQMNGYAVTRGLKDMELDSSSIEKR 

FAFGFLQQLLRWMDISQEFIAHLEAVVSSGRV 

EKSPHEQEIKFFAKILLPLrNQYFTNHCLYFLS 

TP AKVLG S GG HASNKJEKEMITSLFCICL AAL V 

RHRVSLFGTDAPAVVNCLHILARSLDARTVM 

KSGPEIVKAGLRSFFESASEDIEKMVENLRLG 

KVSQARTQVKGVGQNLTYTTVALLPVLTTLF 

QHIAQHQFGDDVILDDVQVSCYRTLCSIYSLG 

TTKNTYVEKLRPALGECLARLAAAMPVAFLE 

PQLNE YN ACS V YTTKS PRERAIL GLPN S VEEM 

CPDIPVLERLMADIGGLAESGARYTEMPHVIE 

rTLPMLCSYLPRWWERGPEAPPSALPAGAPPP 

CTAVTSDHLNSLLGNlLRJfVNNLGIDEASWM 

KRLAVFAQPIVSRARPELLQSHFIPTIGRLRKR 

AGKWSEEEQLALHAKAEAQEGELLVRDEFS 

VLCRDLYALYPLLIRYVDNNRAQWLTEPNPS 

AEEL FRM V GE IFI Y W SKSHNF KREEQNF WQ 

NEINNMSFLTADNKSKMAKAGDIQSGGSDQE 

RTKKKRRGDRYSVQTSLIVATLKKMLPIGLN 

MCAPTDQDLITLAKTRYALKDTDEEVREFLH 

NNLHLQGKVEGSPSLRWQMALYRGVPGREE 

D ADDPEKI VRRVQEV S A VL Y YLDQTEHP YKS 

KKAVWHKLLSKQRRRAVVACFRMTPLYNLP 

THRACNMFLESYKAAWTLTEDHSFEDRMIDD 

LSKAGEQEEEEEEVEEKXPDPLHQLVLHFSRT 

ALTEKSKLDEDYLYMAYADIMAKSCHLEEG 

GENGEAEEEVEVSFEEKQMEKQRLLYQQARL 

HTRGAAEMVLQMISACKGETGAMVSSTLKL 

GISILNG GN AE VQQKMLD YLKDKKEVGFFQS 

IQALMQTCS VLDLNAFERQNKAEGLGMVNE 

DGTVTNRQNGEKVMADDEFTQDLFRFLQLLC 

EGHNNDFQNYLRTQTGNTTTTNIIICTVX)YLL 

RLQESISDFYWYYSGKDVIEEQGKRNFSKAM 

SVAKQVFNSLTEYIQGPCTGNQQSLAHSRLW 

DAWGFLI rVFAHMMMKLAQDSSQIELLKEL 

LDLQKDMVVMLLSLLEGNVVNGMIARQMV 

DMLVESSSNVEMILKFFDMFLKLKDIVGSEAF 

QD YVTDPRGLISKKDFQKAMDS QKQFS GPEI 






WO 01/57188 



PCT/US01/03800



SEQ ID NO: of nucleotide sequence


SEQ ID NO: of peptide sequence


Method


SEQ ID NO: in USSN 09/496,914


Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence


Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence


Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
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QFLLSCSEADENEMDNCEEFANRFQEPARDIG 

FNVAVLLTNLSEHVPHDPRLHNFLELAESILE 

YFRPYLGRIEIMGASRRIERIYFEISETNRAQW 

EMPQVKESKRQFIFDWNEGGEAEKMELFVS 

FCEDTIFEMQIAAQISEPEGEPETDF.DEGAGA 

AEAGAEGAEEGAAGLEGTAATAAAGATARV 

VAAAGRALRGLSYRSLRRRVRRLRRLTAREA 

ATAVAALLWAAVTRAGAAGAGAAAGALGL 

LWGSLFGGGLVEGAKKVTVTELLAGMPDPT 

SDEVHGEQPAGPGGDADGEGASEGAGDAAE 

GAGDEEEAVHEAGPGGADGAVAVTDGGPFR 

PEGAGGLGDMGDTTPAEPPTPEGSP1LKRKLG 

VDGVEEELPPEPEPEPEPELEPEKAD AENG EK 

EEVPEPTPEPPKXQAPPSPPPKKEEAGGEFWG 

ELEVQRVKFLNYLSRNFYTLRFLALFLAFAIN 

FILLFYKVSDSPPGEDDMEGSAAGDVSGAGS 

GGS SG WGLG AGEE AEGDEDENM VY YFLEES 

TG YMEPALRCL SLLHTL V AFLCIIG YNCLKVP 

LVIFKREKELARKLEFDGLYTTEQPEDDDVKG 

DFYGRE RIAEL L GMDL ATLEIT AHNERKPN F P 
PGLLTWLMSIDVKYQIWKFGVIFTDNSFLYLG 

I HHYIsTMPPFA A T-TT I niAMfrVKTI 

VTf I tV& V I VJ. O J_r J_j VJ 1 1 I IXliL I I ru\illwIjiyirUVlVJ V FN. 1 L-r 

RTILSSVTHNGKQLVMTVGLLAWVYLYTW 

AFNFFRKFYNKSEDEDEPDMKCDDMMTCYL 

FHMYVGVRAGGGIGDEIEDPAGDEYELYRVV 

FDITFFFFVIVILLAIIOGLIIDAFGELRDOOEOV 

KEDMETKCFICGIGSDYFDTTPHGFETHTLEE 

HNLANYMFFLMYLINKDETEHTGQESYVWK 

MYQERCWDFFPAGDCFRKQYEDQLS 


501 


1851 


A 


3869 


467 


665 


VTV AIYCQLIFDKG AKTIQ* PFQQI ALv'CKRMK 
LGPCPTPCGKTNSEWIRELSVRVKTIKHLEIGV 
N 


502 


1852 


A 


18R8 


1042


794 


SGMO WRDLTPLOPLPPRFKOFS CLSLPGS WD 
YRHAP\PLLTNF\*FLVEMGFCYVGQAGRKLL 
ASSDQSALASQSAGITGISTAPGPPFFFLNFEA 
GSCSVAQAGVQ 


503 


1853 


A 


3891 


1773 


1193 


EVDSQSGVQ*QAPGSLQLQTPGLK/VSCLLSR 
QDYRSSLPHLASCC YYYYY Y/VFL* RRGLTTL 
VQGGT,KLLPSSNPFASAP*TAG1TGMSHCAGP 
HFNF* MFRKI SCIRE* F* HTRJYDIPFLILFFKET 
WVLLCYPGWPQIPGLKPSSCLRLLSSWDHRC 
APPCPASFFIFHVDRVSPPCPGLVSITFKMLLL 
L 


504 


1854 


B 


3896 


279 


70 


MVSK SKSILMS YNHVELTFSDMKKMPEAFRR 

TQKHT1YLIPYQVIFWSTGKDAMRSFMMPFY 

QKEYYENQ* 


505 


1855 


A 


3899 


2 


1396 


EPGVPTKKTWFDKPDFNRTNSPGFQKKVQFG 

NENTKLELRKVPPELNNISKLNEHFSRFGTLV 

NLQVAYNGDPEGALIQFATYEEAKKAISSTEA 

VLNNRFIKVYWHREGSTQQLQTTSPK.VMQPL 

VQQPXLPWKQSVKERLGPVPSSTIEPAEAQS 

ASSDLPQVLSTVLLA* QKQCIIQLL/WKAAQKT 

LLVSTSAVDNNEAQKKKQEALKLQQDVRKR 

KQEILEKHIETQKMLISKXEKNKTMKSEDKAE 

IMKTLEVLTKNITKLKDF.VKAASPGRCLPKSI 

KTKTQMQKELLDTELDLYKKMQAGEEV'liiL 

RRKYTELQLEAAKRGILSSGRGRGIHSRGRGA 

VHGRGRGRGRGRGVPGHAWDHRPRALEIS 
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AFTESDREDLLPHFAQYGEIEDCQIDDSSLHA 
VITFKTRAEAEAAAVHGARFKGQDLKLAWN 
KPVTNISAVETECVEPDEEEQREIIIA 


506 


1856 


A 


3911 


1952 


919 


DAELSGTLSLVLTQCCKRIKD TVQKLASDHK 

DIHSSVSRVGKAIDKNFDSDISSVG1DGCWQA 

DSQRLLNEVMVEHFFRQGMLDVAEELCQES 

GLSVDPSQKEPFVELNRJLEALKVRVLRPALE 

WAVSNREMLIAQNSSLEFKLHRLYFISLLMG 

GTTNQREALQ Y AKNFQPF ALNHQKDIQ VLM 

GSLVYLRQGIENSPYVHLLDANQWADICDIFT 

RDACALLGLSVESPLSVSFSAGCVALPALIN1K 

AVffiQRQCTGVWNQKLDELPIEV\DLG*KSAGY 

HSIFACPILRQQTTDNNPPMKL VCGHIIS RDAL 

NKMFNGSKLKCPYCPMEQSPGDAKQIFF 


507 


1857 


A 


3936 


439 


18 


SHPFSPAPGICPDAPPPLPRPSKGLGHPGTAGA 
PGSGARCHPPSTCSPS WASPG* G AKASPALPR 
SHG VTLLCKAQ AHLCRGEDSKDASGSTSQA 
WEPG* GA WGMPRCQGPALGSCFCPPGTTVQ 
RPAKQRDKRNRHLGR 


508 


1858 


A 


3944 


120 


412 


WCPAGTLDFPGPQEMVLLEIEVMNQLNHRNL 
IQL YAAIETPHEIVLFMEVYECPK* WGLGGGT 
TRHGASRGGVCAHSIEGGELFERTVDEDYHLT 
EV 


509 


1859 


A 


3949 


31 


392 


LTKTPSPREKGRGVLSVLLMMI*KCRVIFVKJP 
M VFFLQNFC/RIILN VA\WTGD* PNTL* KEQRG 
ITFSDSKS*YKATKIKTMWYCHKNRYID/ERN 
RIEIPEINPCICDKIIFRKLSMTTQ 


510 


1860 


A 


3954 


1013 


885 


F SETRACCPRLEHSGRIE AHC SLNIPGS SDPPT 
SASSVAATTG 


511 


1861 


A 


3956 


1 


1054 


PPAWAPRSPLIWAPTSGRHPCRAALPWSTSSV 

AEGAGKSRGSGEQDWVNRPKTVRDTLLALH 

QHGHSGPFESKFKKEPALTAVARTARKRKPS 

PEPEGEVGPPK\TTERPSRGCPHPQRGSRSP*L 

LHPLLCLRHHPLPHLIPTGPHRLKRPRM\P\SP 

MAALILVADNAGGSHASKDANQVHSTTRRN 

SNSPPSPSSMNQRRLGPREVGOQGAGNTGGL 

EP VHPASLPD SSLATS APIXCTLCHERLEDTH 

FVQCPSVPSHKFCFFCSRQS1KQQGASGEVYC 

PSGEKCPLVGSNVPWAFMQGEIATILAGDVK 

VKKERDS 


512 


1862 


A 


3957 


1086 


3 


QDRARLDCSSATSAHCNLRLPGS+DSPASASR 
VAGTTDTHHHTWLILGSSVQTGFDHVGQAG 
LELLTSGDPFISASESAGIMGMSHCVWP*SWG 
LSHHMAPPQGDGGRARGTPGPEQSFWNLSC 
H * PRCQ VPS * LMTQLyFWGRHQYNPTMKRGK 
LRHREACSLPLPGEGEPGLQPSS\* SQNPCSSPL 
FHHGL*AWLWCPELLLQGQARRH*RSPPS/FK 
CPATLSLTAWSQTKRLRSQFLLLPWL+RAL*H 
PF\CHWPSRRSLGDPLLPRSQG*RDGT*ASTFC 
S YF* DTESHL VAQAGVQ WRDLGSLQPPCPRL 
K\RFSRLSPPSSYTHRYVPSHLAESC1SSRDRIP 
PSRPDRSRN SN SLSR 


513 


1863 


A 


3961 


3038 


476 


VALTTSMCCNKQVrVlDKIKSASlADRCGALH 

VGDHILSroGTSMEYCTLAEATQFLANTTDQ 

VKLEILPHHQTRLALKGPDHVKIQRSDRQLT 

WDSWASNHSSLHTNHHYhfi"YHPDHCRVPAL 

TFPKAPPPNSPPALVSSSFSPTSMSAYSLSSLN 

MGTLPRSLYSTSPRGTNIMRRRLKKKDFKSSL 
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SLASSTVGLAOQWHTETTEVVLTADPVTGF 

GIQLQGSVFATETLSSPPLISYIEADSPAERCG 

VLQIGDRVMATNGIPTEDSTFEEASQLLRDSSI 

TSKVTLEIEFDVAESVIPSSGTTHVKLPKKHN 

VELGITISSPSSRKPGDPLVISDIKKGSVAHRT 

GTLELGDKLLAIDNIRLDNCSMBDAVQILQQC 

EDLVKLKIRKDEDNSDEQESSGAIIYTVELKR 

YG GPLGMTI SGTEEPXFDL • 11 S SLTK.GGL AERT 

G AIHIGDRIDATN SS SL KGKPLS EAIHLLQMAG 

ETVTLKIKKQTDAQSASSPKKFPISSHLSDLGD 

VEEDSSPAQKPGKJLSDMYPSHGCPSVDSAVD 

SWDGSA\DZ>TS\YGTEGT\SFQASGY\NFNTYD 

WRSPKQRGS\LSPVT\KPRSQTYPDVGLSYED 

WDRSTASGF AG AA\DS AETEQ EENFWSQ ALE 

DLETCGQSGILRELEATIMSGSTMSLNHEAFT 

PRSPAGSDRPSFQERSSSRPHYSQTTRSNTLPS 

DVGRKSVTLRXMKQEDCEIMSPTPVELHKVT 

LYKDSDMEDFGFSVADGLLEKGVYVKNTRPA 

GPGDLGGLKPYDRLLQVNHVRTRDFDCCLV 

VPLIAESGNKLDLVISRNFLASQfCSIDQQSLPG 

D*SEQNSAFFQQPSHGGNLETREPTNTL 


514 


1864 


A 


3967 


833 


800 


LEKQGVSGMATKRLARQLGLIRRXSIAPANG 

NLGRSKSKQLFDYLIVIDFESTCWNDGKHHH 

SQEnEFPAVLLNTSTGQIDSEFQAYVQPQEHPI 

LSEFCMELTGIKQAQVDEGVPLKICLSQFCK 

WIHKIQQQKNIIFATGISEPS/DF*SKIMCICYL 

VR*RJSYTY*SKHKSKGC 


515 


1865 


A 


3969 


492 


182 


CRFWGISTHCDTCDPLSPQTTEG**EGDLWSL 
DLLGPEFLARKPLFKTKTYQSTF*SISKNE/FTC 
PNFIIEEGTDLIF\*QVKHNPCHRLTPEEGTVQL 
NRADS 


516 


1866 


A 


3977 


2 


1357 


KMLC/QKE SNYIRLKRAKMDK SMF VKTKTLGI 

GAFGEVCLARKVDTKALYATKTLRKKDVLL 

RNQVAJ IVKAERDIL AEADNE WWRL YYSFQ 

DKDNLYFVMDYIPGGDMMSLLIRMGIFPESL 

ARFYIAELTCAVESVHKMGFJHRDIKPDNILID 

RDGHIKLTDFGLCTGFRWTHDSKYYQSGDHP 

RQDSMDFSNEWGDPSSCRCGDRLfCPLERRAA 

RQHQRCLAHSLVGTPNYIAPEVLLRTGYTQL 

CDWWSVGVILFEMLVGQPPFLAQTPLETQM 

KVINWQTSLHIPPQAKLSPEASDLIIKLCRGPE 

DRLGKNGADEIKAHPIF*NQFDFSQ* PEDSRS 

AFKQFP'NHTTPTDTSNFDPWDPDKLWSDDN 

EEENVNDTLNGWYKNGKHPEHAFYEFTFRRF 

FDDNGYPYNYPKPIEYEYTNSQGSEQQSDEDD 

QNTGSEIKNRDLVYV 


517 


1867 


A 


3980 


1358 


1022 


FFFKKFTQSLGFLLFSFSFLFSCFFFFHFVLFCY 
VFLDRVPLCHPGWSAVVQSQVT/VNLPPSWD 
*RCRPPH/LANLCNFCRD\SFTTLPRLVLNTWA 
QAIFQPQPPKVLGLQV 


518 


1868 


A 


3986 


974 


666 


SPEMESHPITQAGVQWHHLSSLQPLPPGFK*F 
SCFSLPE*LGYRHVPPCLANSVFSVEMG\FLH 
VGQAGLELLTSGDLPALASQSAGITG\SHRAR 
PENGFENIF 


519 


1869 


A 


3994 


751 


126 


NQGLRHVGLCRTCLVNQMFASSILGKSHHHS 

LISINQGHNALWKAAG\PLPLKAGYC\QSFSPC 

DSLKYG\SWDEKDLTVPQRDTHKRSVLRWIS 

QRGK\LAVEMEEGHCLLVLPLGTECLGIK\PrV 

HLFSSEMGE\NRPMVG\ARHVYSNAALLSFTP 
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LRCLGGEKHKSGLHARPVTVPSLELHYDMDSl 
AHVXFADLLLITTLPSYYIPFC 


520 


1870 


A 


3999 


882 


698 


QSFRLSLLSSWDYRHM*PRLANF*TAFFCRDRy 
SLALLPRLVSNSWPQAILPPRPPK.VLGLQT 


521 


1871 


A 


4011 


1346 


1178 


FFF* ETVSC S AS* AG VRSHDNSSLQPPSPG\S SK 
PPTSASHVAGATGTHHHAWLLSV 


522 


1872 


A 


4015 


2 


377 


QGIALLTRMGESVKHVTGGYKLRTRPLEFAA 
1GD YLDTF ALKLGTIDRIAQR1 IKEEIE YL VELR 
EYGPVYSTWSALEGELAEPLEGVSACIGNCST 
AL^ELTDDMTEDFLFVLREYILYSDSMK 


523 


1873 


A 


4018 


341 


19 


ERVIHNQIQQAQRSPHIFNARRSS/PRPNIVELP 
KVKEVCKTSKS/GQV1YKGVSIRLRANFLAEP 
L*NRREWDEAIKVLKEKQ\FLSKMVYPANLSF 
GNEGDITSFPAK 


524 


1874 


A


4020 


1067 


743 


FFLRWSL/DSVAQAG VKWCNLGSLQAPPPGF 
TPFSCL SLPSSWD YRHPPPRL AN* LTNFLCF* * 
RQGFT VLARMVLIS *PHDLPAS ASQ S AGITGL 
SHCSWPTSSrLS 


525 


1875 


A 


4021 


781 


351 


QFRVIFFFLRRSHSVAQAGMQWHDHSLLQPL 
PPRLKQ/F/SHLSPPSI WDYRRVPPCLVNF SI FF 
VETGSCQPCLQLLGSSNPPASASQSAGIAGISH 
QGQPE+ SFDIRFACVIAALRETFQCLCS ASR VN 
NKIINRPTHPVESSF 


526 


1876 


A 


4024 


80 


341 


TPS STSRGTEEQQ S SKMAW QRREEKEHLN VR 
RSSAEDGWK ADKP/VDG •TPGEDHLPTPSPFQ 
LHIHSSESQLHHSVKSPPSLSFRLM 


527 


1877 


A 


4026 


593 


230 


DFYLYPERKKRGQMMTAVSLTTRPQESVAFE 
DVAVYFTTKEWAIMGXPAERALYRDVMLEN 
YGGCGPL*CHPTSKPALVFS\LEQGKESCFSPA 
TGSSLSRNDWRAGWIGYLELRRYTYLS 


528 


1878 


A 


4028 


1160 


242 


GTSELLCIQRWNWGPAFPPRPGLALAPTLQLL 

VEMG S AKSVP VTPARPPPHNKHL ARV ADPRS 

PSAGILRTPIQVESSPQPGLPAGEQLEGLKHAQ 

DSDPRSPTLGIARTPMKTSSGDPPSPLVKQLSE 

VFETEDSKSNLPPEPVLPPEAPLSSELDLPLGT 

QLSVEEQMPFWNQTEFPSKQVFSKEEARQPT 

ETP VASQSSDKPSRDPETPRSS\G SMRNRWKPX 

NSSKVL\GKSPLHPSCQDDNSPGTLTLRQGKA 

AFKPLSENVSELK\EGAVILGTGR\LLKTEGRA 

WEQGQDVHDKENQHFPLVES 


529 


1879 


A 


4039 


2 


366 


KDMVLIMEMQSMITMKCPQYL*E*RKIPDITK 
CW*GCGSTGILIFC/WS*PL*KTI*QPR»FKQI*T 
ILTIIYSIM*EHTFHNAGV*LSDIYPRFMKGYV 
HTEICT*MFIAVXFVWKTWKQF 


530 


1880 


A 


4057 


358 


3 


LL^VNGNTIVTVFTKAQNKKNKGSRSILFKQL 
RXYGSRINLLKSKHDKNICTENYKT*MKEIEA 
/DTDK WKDILCS WTRRIHMKDILCS WIGRTHV 
VKISILPKVNYRFYL1SIKIIMAI 


531 


1881 


A 


4061 


50 


278 


TQGTEEIYKISSCEWVQASFSTPLITLHDFKIY 
HKATVTKM V W Y WI IRQ * KFSKN/RIESSEIEPH 
IYDQFIFDKGEKJIQEKGNSFFNN/MCWKNWIF 
T*KR 


532 


1882 


A 


4069 


19 


368 


NDLLENFKF WE* FKE* LENIN GTV1 'HKE' 1 GO V 
YKELSSPKYSGTRQFYGQTISNFPGKnSMVY 
KLFQNTE/TEGPvHPISLYEFRlTLITIPNKDNIYL 
QIWMPVSLMNrVTLKCPT 


533 



1883 


A 


4076 


1 


355 


PIRXFTKVAG * K.SNTPK* LAFLHINNEQFENKl/ 
rTNI/PFI IASKRJKY S GISLTKEMKDLYTETLLR 
FUKEDTNKWKDL'SCFWVGR/LNIVKMPK,VIC 
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IFNAIPIKMPMMCMAK1EKNSS 


534 


1884 


A 


4088 


3 


1931 


IIDS STRRMESERSPLYRQLIDLG YL SSSHWNC 

GAPGQDTKAQSMLVEQSEKLRHLSTFSHQVL 

QTRLVDAAKALNLVHCHCLDIFINQAFDMQR 

DLQITPKJU.EYTRKKENELYESLMN1ANRKQE 

EMKDMIVETLNTMKEELLDDATNMEFKDVI 

VPEN GEPVGTREIKCCIRQIQELI1 SRLNQ A VA 

NKLIS S VD YLRESF VGTLERCLQSLEKSQD V S 

VHITSNYLKQILNAAYHVEVTFHSGSSVTRM 

LWEQIKQIIQRrrWVSPPAITLEWKRKVAQEAI 

ESLSASKLAKSICSQFRTRLNSSHEAFAASLRQ 

LEAGHS GRLEKTEDL WLRVRKDHAPRLARLS 

LESRSLQDVLLHRKPKLGQELGRGQYGWYL 

CDNWGGHFPCALKSVVPPDEKHWNDLALEF 

HYMRSLPKHERLVDLHGS VID YN YGGGS SI A 

VLLIMERLHRDI.YTGLKAGLTLETRLQrALDV 

VEGlRFLHSQGLVHRDIKLKN\ r LLDKQNRAKI 

TDLGFCKPEAMMSGSIVGTPIHMAPELFTGK 

YDNSVDVYAFGELFWY1CSGSVKLPEAFERCA 

SKDHLWNNVRRGARPERLPVFDEECWQLME 

ACWDGDPLKRPLLGIVQPMLOGIMNRJLCKS\ 

NSEQPNRGLDDST 


535 


1885 


A 


4090 


2 


417 


ALMPHEANYEEIFLKTDKDMDGFESGLEVRE 

IFLKTR/GLPSTLLAHIWALCDSKDCGKLSKD 

HFALAFHLmQKLlKGIDPPLVLTPEKISPSNR 

ASLQKVTELTRKPVCIIFKGTILWRITDSIWMK 

HNRKRIWLRA 


536 


1886 


A 


4102 


569 


829 


DHQK*KNIPCSWIGRINIVKMSILPKAIYRFSAI 
P1KIFMTFFTEI*S*NVYRTTKTQE*AKAILSKK 
EQNLEESHYLDFK* YYRAV 


537 


1887 


A 


4104 


54 


281 


SIDCEHLIRRMLVLDPSKRLTIAQIKEHKWML 
IEVPVQRPVLYPQEQENEPSIGEFNEQVLRLM 
HSLGIDQQKTTE 


538 


1888 


A 


4109 


141 


314 


IRHIPLKIRSWSHLKCFYKFILTFFFAGCSQPL 
VPRENITAWMNAIGLIITALPVS 


539 


1889 


A 


4111 


268 


1 


ASRP WGHS YP* FNQQE VDTLKRPIASS EI* MM 
I+KFATVKKSPGPYRFTAEFSHTFKEDLVPILW 
PLFPKIYREGTLPHSFYEASITL 


540 


1890 


A 


4142 


198 


2064 


PEPGAGRAATPWGPLFWRGRGSGRCEKAAE 

AALGDFLGLHRRTQQPAVDRLLSDASAQWR 

VRGHOGVRESGRAPQQPGRRRGRRPRKRPR 

GRWRREGCGAOGRGVCVAAWSQRSIAGNN 

DYRLFHKMSNSHPLRPFTAVGEIDHVHILSEH 

IGALLIGEEYGDVTFVVEKKRFPAHRVILAAR 

CQYFRALLYGGMRESQPEAEIPLQDTTAEAFT 

MLLKYIYTGRATLTDEKEEVLLDFLSLAHKY 

GFPELEDSTSEYLCTILNIQNVCMTFDVASLY 

SLPKLTCMCCMFMDRNAQE VL SSEGFL SLSK 

TALLNIVLRDSFAAPEKDIFLALLNWCKHNSK. 

ENHAELMQAVRLPLMSLJELLNVVRPSGLLSP 

DAILDAIKVRSESRDMDLNYRGMLIPEENIAT 

MKYGAQWKGELKSALLDGDTQNYDLDHG 

FSRHPIDDDCRSGIEIKLGQPSIINHVRJLLWDR 

DSRSYSYFIEVSMDELDWVRVIDHSQYLCRS 

WQFXYFPARVCRYIRJVGrHNTVNKIFHIVAF 

ECMFTNKTFTLEKGLIVPMENV ATIADC AS VI 

EGVSRSRNALLNGDTKNYDWDSGYTCHQLG 

SGAIVVQLAQPYMIGSIRVLLWDCDDRSY 


541


1891


A 


4146 


282 


778 


GTLGYPNGARGQPQDNFFAHQWSHHPPISAC 
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HAESENFAF WQDMKWKNKFWGKSLEI VP VG 
TVNVSLPPvFGDHFEWNKVTSCIHNVLSGQRW 
IEHYGE VLIRNTQD S SCHCK1TFCKAKY WS SN 
VHEVQGAVLSRSGRVLHRLFGKWHEGLYRG 
PTPGGOCIWKP 


542 


1892 


A 


4147 


44 


433 


SVDAYVCNDIVFSYRTTITLLEGA*LTHRYVA 

QDPKQGQLRSLHLTCDSAPAGSQGTWSTSCR 

INHLIFRGGAQITFLATFDDSPKAVLGDRLLLT 

ANVSSENNTPRTSKTTFQLELSVKDAVYTW 

SSH 


543 


1893 


A 


4153 


678 


11 


TISYPQCLTQMYFLISFANVDTFLLPIMALDH 

YVAICSALQ*CSirrP/ELCQGLPVLA*AGSSLIS 

PVHTVIMSRLAFCSSAQISHFYRDAYLLMKIA 

CSHT* \NQHVFLGA VVLFLAPCALlLrVS YIRIA 

AAILRIPSPTRRRKACSICSSHLSLVTLFYGTV 

LGICI * PPDSF s aqdaiatimytv VT SMLNPFI y 

SLMNKEVQEAVRRJLFSRGSHSSWCW 


544 


1894 




4158 


3 


538 


LLYAQAGVQ*LNLSSLQPQPAGLKQSSHPSLP 

SSWDYRYSTPHPANFFVEMEFHHVAQAGLEL 

LGSGDLPTSTSHSAGITGVNSHHAPPRLISSEGS 

LLGHLLCLPMVFPLLCVFVLISSSLAGEEAAG 

LRVQKLWPAVVLSHLPVCWFHCSGIWSEV1E 

LKVGREGHVLPWQAHWEF 


545 


1895 


A 


4160 


1 


412 


HPLGLGLVPSEIFSPQDKKAADGSILAPARGE 
DLEAGLKGSFMDGRLQ AS VS VFR1QRVG S AM 
QDTASAMPCLPYYPTSHCFMAGGKSRSQGW 
ELELSGEP APG WQ VL AGYTYTQ ARYLRD A SE 
ANVGQPLRPVDPR 


546 


1896


A 


4174 


1252 


1190 


FFQVFIFLFLIFFKTEFHSCCPGAVQWHDLDSL 
QPPPPRFKGFSCLSLPSSWDYRHAPAHPANFV 
FLVETGFLHV\GO\ASLELPTSGDTPAS\ASQSA 
GITG VSHHA * PRASGRRCW 


547 


1897 


A 


4176 


3029 


1 


AGPDGL AAPASC QGARGQTRVPGAFS WLAP 

GSHHASEGLAPGVPPAGGVSAQELTAPPQEG 

WGLGAPPAAPRPESDEKRAGSDAVRSFSRGA 

RDSLGQRRLGGTRGAGPAGKGAQRTMGPAS 

GFHSFPPRPHQEPSPRSSCWQHLLWHCPWPQ 

PSRLPRLTPAQLLQGPGVLAAPPGP*HVPGFL 

AQ SP WPLPSGPRSP*DPLHQG AL VPLPQGGSP 

HTAPHCLPSVLSPA1QQPLLPTAST/SSRSPPAS 

TMAPIPSALAVWEPAGSSPQLSSAPADSS\PLP 

ALPKVLPPWTQKPLLGCLCQSPLPLLSPPDQI/ 

RCPPACSPAAASSFSFESQPCPSAPSKASPAPA 

ALVIVGPHHPP* SQQPQSQS VHPHGPGGPQPPL 

AASS LF WMFCQPPPPHPQFL WHRPLPVTGKA 

LASXPLCFRPAPGSLRQTPLPPQFHIPRPGLSAP/ 

PPPASGTSDSSDSRSPSASAARVWPPA\SPPPP 

AARHRPHPPEYFLSPCPFSCGFPRLLGRPRRPQ 

ALQTPRAWDLPPGSSPAPLCSGPELP*AFPPLP 

PFPRVA*LGSGHPPSAQVPGLW*RCV*GHPIP 

RPVGHS* SGPPHSPPL* APPQAWPLELPPSRQC 

LQPLHLRAAQPLDPCCSLSPPGPPLPVPALPS 

WPGRP*SPSPASSQPPYHAGLPGPQSSPLPPGL 

PQLPSLRSGSQQPLLFFQCPGPGAVWGKGSPQ 

PLSPHPPPP/ARTQTFPVASRSLSPGTAPYSVCL 

TPSRSASSLPEWLASSLPKJPQSSGSU'LGPTSP 

MP*CFHRPSPPLP/LSSPFPA\LRPQAPQFPLHLP 

P»PPAPSPGCPLPPLAQQHQPSPPSPHARSTLT 

PPLWPSLAIXP*PLPPPPPVPSFSASLLCSLPAH 
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GTPASPGLGRSCLGKPQTLPWISFWPPSGRLA 
PGTWQPW/PVSPAPLSCLSAWDPWELPSPQPQ 
VCSTAELPTSCLLSSPGP\PAFQPPRFGCL*GPP 
GPPGLPPLQSSLSFPPPPPPVPQPPAPPALQWG 
LHLPGGRTK 


548 


1898 


A 


4180 


2369 


844 


RIHREEDFQFILKGIARLLSNPLLQTYLPNSTK 

KIQFHQELLVLFWKLCDFNKVGQPRGALQGD 

GEQLPQ*PGGRDSVRLRGVGQSCPSLELSPLG 

PSPHP*KFLFFVLKSSDVLDILVPILFFLNDAR 

ADQSRVGLMHIGVFILLLLSGECNFGVRJLNKP 

YS[RVPMDIPVFTGTHADLLiV\VFHKIiTSGHQ 

RLQPLFDCLLTIVVNVSPYLKSLSMVTANKLL 

HLLEAFSTTWFLFSAAQNHHLVF1T.LEVFNNI 

1QYQFDGNSNLVYAJIRKRSIFHQLANLPTDPP 

TIHKALQRJRRRTPEPLSRTOSQGGAPPWRAPA 

PLPLQSQAPSRPVWWLLQALTS*PRSPRCQR 

MAPCGPWNLSPSRAWRMAARLRGSPARHGG 

SSGDRP/HSSASGQWSPTPEWVLSWKSKLPLQ 

TIMRLLQ VL VPQ VEKICIDKGLTDESEILRFLQ 

HGTLVGLLPVPHPILIRKYQANSGTAMWFRT 

YMWGVIYLRNVDPPVWYDTDVKLFEIQRV 


549 


1899 


A 


4191 


858 


321 


LPWQRLGVLLSRGKMAVTGWLESLRTAQKT 

ALLQDGRRKVHYLFPDGKEMAEEYDEKTSE 

LLVRKWRVKSALGAMGQWQLEVGDPAPLG 

AGNLGPELIKESNANPIFMRKDTKMSFQWRTR 

NLPYPKDVYSVSVDQKERCnVRTTNKKYYK 

KFSIPDLDRHQLPLDDALLSFA\TPTAP 


550 


1900 


A 


4192 


1 


1980 


IRHTGSDIAGVCGWLLLSGPCGVGLDLDSRLL 

GAS AMRRSE VL AEES I VCLQKALNHLREI WE 

LIGIPEDQRLQRTEVVKKHIKELLDMMIAEEE 

SLKERLIKSISVCQKELNTLCSELHVEPFQEEG 

ETTIL QL £ KDL RTQ VEL MRKQKKERK QEALKL 

LQEQDQELOEILCMPHYDIDSASVPSLEELNQ 

FRQHVTTLRETKASRRE EF/V S SIKRQ I IL C ME 

ELDHTPDTSFERD WCEDED AF CLSLENIATXL 

QKLLRQ\LEMQKSQNEAVCEG\LRTQI\RELW 

DRLQIPEEEREAVATTMSGSKAKVRK\ALQ\LE 

VDRLEELEKCKTMKKVIEAIRVELVQYWDQC 

FYSQEQRQAFAPFCAEDYTESLLQLHDAEIVR 

LKNYYEVHKELFEGVQKWEETWRLFLEFER 

KASDPNRFTT4RGGNLLKEEKQRAKLQKMLP 

KLEEELKARIELWEQEHSKAFMVNGQKFME 

YVAEQWEMHRLEKERAKQERQLKNKKQTET 

EMLYGSAPRTPSKRRGLAPNTPGKARKLNTT 

TMSNATANSSIRPEFGGTVYHSPVSRLPPSGSK 

PVAASTCSGKKTPRTGRHGANKENUELNGSI 

LSGGYPGSAPLQRNFSrNSVASTYSEFADPSLS 

DSSTVGLQRELSKASKSDATSG1LNSTNIQS 


551 


1901 


A 


4194 


3 


1008 


AWHEGLVSSPAIGAYLSASYGDSLWLVATV 

VALLDICFILVAVPESLPEKMRPVSWGAQISW 

KQADPFASLKK VGKD STVLLMCIT VCLS YLPE 

AG\QYSSFF\LYUR\QVIGFG\TVKIAAFIAMVGI 

L SIV AQTAFLSILMRSLGNKNT VLLGLGFQML 

QLAWYGFGSQAWMMWAAGTVAAMSSrrFP 

AI S ALVSRNAESDQQG VAQG IITGIRGL CNGL 

GPALYGFIFYMFHVELTELGPKLNSNNVPLQ 

GAVIPGPPFLFGACIVLMSFLVALFIPEYSKAS 

GVQKHSNSSSGSLTNTPERGSDEDIEPLLQDS 

SIWELSSFEEPGNQCTEL 






WO 01/57188 



PCT/US01/03800



Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion)
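The legend above can be applied mechanically when checking a row of the table. The following is a minimal sketch, not part of the original disclosure: the function names `summarize` and `expected_residues` are hypothetical, and the sketch assumes that the predicted nucleotide locations are inclusive and that three nucleotides encode one residue. It uses the row with peptide SEQ ID NO: 1887 as transcribed in this table.

```python
import re

# One-letter residue codes listed in the table legend.
RESIDUES = frozenset("ACDEFGHIKLMNPQRSTVWY")

# Annotation marks from the legend, mapped to illustrative tally names.
MARKS = {"X": "unknown", "*": "stop", "/": "deletion", "\\": "insertion"}


def summarize(seq):
    """Tally residues and legend marks in one sequence cell of the table."""
    counts = {"residues": 0, "unknown": 0, "stop": 0,
              "deletion": 0, "insertion": 0, "other": 0}
    for ch in re.sub(r"\s+", "", seq):  # ignore spaces and line breaks
        if ch in RESIDUES:
            counts["residues"] += 1
        else:
            # Anything outside the legend (digits, lowercase, punctuation)
            # is counted as "other" so scanning damage stays visible.
            counts[MARKS.get(ch, "other")] += 1
    return counts


def expected_residues(begin, end):
    """Residue count implied by the predicted begin/end nucleotide locations.

    Assumption: coordinates are inclusive and three nucleotides encode one
    residue. The absolute difference is used because the beginning location
    is larger than the end location in some rows.
    """
    return (abs(end - begin) + 1) // 3


# Row with peptide SEQ ID NO: 1887 (predicted locations 54 and 281).
row_1887 = """SIDCEHLIRRMLVLDPSKRLTIAQIKEHKWML
IEVPVQRPVLYPQEQENEPSIGEFNEQVLRLM
HSLGIDQQKTTE"""

print(summarize(row_1887)["residues"], expected_residues(54, 281))
```

A non-zero "other" tally, or a mismatch between the two printed numbers in a row without deletion or insertion marks, is one hedged indicator that the row was damaged in transcription.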



SEQ ID NO: of nucleotide sequence


SEQ ID NO: of peptide sequence


Method


SEQ ID NO: in USSN 09/496,914


Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence


Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence





552 



1902 



4197 



14302 



ARPPPAPGSRQQKQKAAPGAAAAAELRGAR 

EPAPARRRGTMADGG EGEDEI QFURTDDE W 

LQCTATIHKEQQKLCLAAEGFGNRLCFLESTS 

NSKKVPPDLSICTFVLEQSLSVRALQEMLANT 

VEKSEGQVDVEKWKFMMKTAQGGGHRTLL 

YGHAILLRHSYSGMYLCCLSTSRSSTDKLAFD 

VGLQEDTTGEACWWTIHPASKQRSEGEKVR 

VGDDLILVSVSSERYLHLSYGNGSLHVDAAF 

QQTLWSVAPISSGSEAAQGYLIGGDVLRLLH 

GHMDECLTVPS GEHGEEQRRTVHYEGG AVS 

VHARSLWRLETLRVAWSGSHIRWGQPFRLR 

H VTTGKYI SLMEDKNLLLMDKEKAD VKSTA 

FTFRSSKEKLDVGVRKEVDGMGTSEIKYGDS 

VC YIQH VDTGLWLTYQSVD VKSVRMGS I QR 

KAIMHHEGHMDDGISLSRSQHEESRTARVIRS 

TVFLFNRF1RGLD ALSKKAKASTVDLPIES V SL 

SLQDLIGYFHPPDEHLEHEDKQNRLRALKNR 

QNLFQEEGMINL VLECIDRLH VY SS AAHF AD 

VAGREAGESWKSILNSLYELLAALJRGNRKN 

CAQFSGSLDWLISRLERLEASSGILEVLHCVL 

VESPEALNlIKEGFDKSnSLLDKHGRNHKVLD 

VLCSLCVCHGVAVRSNQHLICDNLLPGRDLL 

LQTRLVNHVSSMRPNIFLGVSEGSAQYKKWY 

YELMVDHTEPFVTAEATHLRVGWASTEGYSP 

YPG GGEEWGGNG V GDDLFS YG FDGLHLW SG 

C I ARTVSS PNQHLLRTDDVIS CCLDLS APSISF 

RrNGQPVQGMFENFNIDGLFFPVVSFSAGIKV 

RFLLGGRHGEFKFLPPPGYAPCYEAVLPKEKL 

KVEHSREYKQERTYTRDLLGPTA'SLTQAAFT 

PIPVDTSQIVLPPHLER1REKLAENIHELWVMN 

KIELGWQYGPVRDDNKRQHPCLVEFSKLPEQ 

ERNYNLQMSLETLKTLLALGCHVGISDEHAE 

DKVKKMKLPKNYQLTSGYKPAPMDLSFIKLT 

PSQEAMVDKLAENAHNVWARDRTRQGWTY 

G1QQDVKNRRNPRJLVPYTPLDDRTKKSNKDS 

LREAVRTLLGYGYNLEAPDQDHAARAEVCS 

GTGERFRIFRAEKTYAVKAGRWYFEFETVTA 

GDMRVGWSRPGCQPDQELGSDERAFAFDGF 

KAQRWHQGNEHYGRSWQAGDWGCMVDM 

NEHTMMFTLNGErLLDDSG SELAFKDFD VGD 

GFIPVCSLGVAQVGRMNFGKDVSTLKYFTIC 

GLQEGYEPFAVNTNRDITMWLSKRLPQFLQV 

P SNHEHDE VTRIDGTID SSPCLKVTQKSFGSQN 

SNTDIMFYRLSMPIECAEVFSKTVAGGLPGAG 

LFGPKNDLEDYDADSDFEVLMKTAHGHLVP 

DRVDKDKEATKPEFNNHKDYAQEKPSRLKQ 

RFLLRRTKPDYSTSHSARLTEDVLADDRDDY 

DFXMQTSTYYYSVRIFPGQEPANVWVGWrTS 

DFHQYUTGFDLDRVRTVTVTLGDEKGKVHE 

SIKRSNCYMVCAGESMSPGQGRNNNGLEIGC 

WDAASGLLTFIANGKELSTYYQVEPSTKLFP 

AVFAQATSPNVFQFELGRIKNVMFLSAGLFKS 

EHKNPVPQCPPRLHVQFLSHVLWSRMPNQFL 

KVDVSRISERQGWLVOCLDPLQFMSLHIPEEN 

RSVDILELTEQEELLKFHYHTLRLYSAVCALG 

NHRVAHALCSHVDEPQLLYAIENKYMPGLLR 

AGYYDLL1DIHLS S Y ATARLMMNNEYI VPMT 

EETKSITLFPDENKKHGLPGIGLSTSLRPRMQF 

SSPSFVSISNECYQYSPEFPLDILKSKTIQMLTE 

AVKEGSLHARDPVGGTTEFLFVPLtKLFYTLLr














MGIFHNEDLKHILQLIEPSVFKEAATPEEESDT 

LEKELSVDDAKLQGAGEEEAKGGKRPKEGLL 

QMKLPEPVKLQMCLLLQYLCDCQVRHIUEAI 

VAFSDDFVAKLQDNQRFRYNEVMQALNMSA 

ALTARKTKEFRSPPQEQINT4LLNFKDDKSECP 

CPEEIRJDQLLDFHEDLMTHCGIELDEDGSLDG 

NSDLTIRGRLLSLVEKVTYLKKKQAEKPVES 

DSKKSSTLQQLISETMVRWAQESV1EDPELVR 

AMFVLLHRQYDGIGGLVRALPKTYTINGVSV 

EDTINLLASLGQIRSLL S VRMGKEEEKLMIRG 

LGD1MNNKVFYQHPNLMRALGMHETVMEV 

MVNVLGGGESKErTFPKMVANCCRFLCYFCR 

1SRQNQKAMFDHLSYLLENSSVGLASPAMRG 

STPLDVAAASVMDNNELALALREPDLEKWR 

YLAGCGLQSCQMLVSKGYPDIGWNPVEGER 

YLDFLRFAVFCNGESVEENANVWRLLIRRPE 

CFGPALRGEGGNGLLAAMEEAIKIAEDPSRD 

GPSPNSGSSKTLDTEEEEDDTIHMGNAIMTFY 

SALIDLLGRCAPEMHLIHAGKGEAIRIRSILRS 

LIPLGDLVGVISIAFQMPTIAKDGNVVEPDMS 

AGFCPDHKAAMVLFLDRVYGIEVQDFLLHLL 

EVGFLPDLRAAASLDTAALSATDMALALNRY 

LCTAVLPLLTRCAPLFAGTEHHASLIDSLLHT 

VYRJLSKGCSLTKAQRDSIEVCLLS1CGQLRPS 

MMQHLLRRLVFDVPLLNEHAKMPLKLLTNH 

YERCWKYYCLPGGWGNFGAASEEELHLSRK 

LFWGIFDALSQKKYEQELFKJLALPCLSAVAG 

ALPPDYMESNYVSMMEKQSSMD SEGNFNPQ 

PVDTSNITIPEKLEYFINKYAEHSHDKWSMDK 

LANG WI YGEIYSD S SKVQPLMKPYKLLSEKE 

KEIYRWPIKESLKTMLARTMRTERTREGDSM 

ALYNRTRRISQTSQVSVDAAHGYSPRAIDMS 

NVTL SRDLHAMAEMMAENYHNI WAKKKKM 

ELESKGGGNHPLLVPYDTLTAKEKAKDREKA 

QDILKFLQINGYAVSRGFKDLELDTPSIEKRFA 

YSFLQQLIRYVDEAHQYILEFDGGSRGKGEHF 

P YEQEDCFFAK WLPLIDQ YFKN HRL YFLS AA 

SRPLCSGGHASNKEKEMVTSLFCKLGVLVRH 

RI SLFGNDATSIVNCLHILGQTLD ARTVMKTG 

LESVKSALRAFLDNAAEDLEKTMENLKQGQF 

THTRNQ PKG VTQ I INYTT V ALLPMLS SLFEHI 

GQHQFGEDLILEDVQVSCYRILTSLYALGTSK 

SrYVERQRSALGECLAAFAGAFPVAFLETHLD 

KHNIYSIYNTKSSRERAALSLPTNVEDVCPNIP 

SLEKLMEEIVELAESGIRYTQMPHVMEVILPM 

LCSYMSRWWEHGPENNPERAEMCCTALNSE 

HMhrTLLGMLKirYNNLGIDEGA VVMKRLA VF 

SQPIINKVKPQLLKTHFLPLMEKLKKKAATVV 

SEEDFiLKAEARGDMSEAELLILDEFTTLARDL 

YAFYPLLIRFVDYNRAKWLKEPNPEAEELFR 

MVAEVFIYWSKSIINFKREEQNFVVQNEINN 

MSFLrmTKSKMSKAAVSDQERKKMKRKGD 

RY SMQTSLI VAALKRLLPIGLNIC APGDQELI A 

LAXNRFSLKDTEDEVRDIIRSNIHLQGKLEDP 

AIRWQMALYKDLPNRTDDTSDPEKTVERVL 

DIANVLFHLEQKSKRVGRRHYCLVEHPQRSK 

KA V WHKLLSKQRKRA VVA CFRMAPL YNLPR 

HRAVNLFLQGYEKSWIETEEHYFEDKLIEDLA 

KPGAEPPEEDEGTKRVDPLHQLILLFSRTAT ,T 

EKCKLEEDFLYMAYADIMAKSCHDEEDDDG














EEEVKSFEEKEMEKQKLLYQQARLHDRGAA 

EMVLQTISASKGETGPMVAATLKLGIAILNGG 

NSTVQQKMLDYLKEKKDVGFFQSLAGLMQS 

CSVLDLNAFERQNKAEGLGMVTEEGSGEKV 

LQDDEFTCDLFRFLQLLCEGHNSDFQNYLRT 

QTGNNTTVNI[ISTVDYLLRVQESISDFYWYY 

SGKDV1DEQGQRNFSKAIQVAKQVFNTLTEY1 

QGPCTGNQQSLAHSRLWDAVVGFLHVFAHM 

QxMKLSQDSSQEELLKELMDLQKDMWMLLS 

MLEGNWNGTIGKQMVDMLVESSNNVEMIL 

KFFDMFLKLKDLTSSDTFKEYDPDGKGV1SK 

RDFHKAMESHKHYTQSETEFLLSCAETDENE 

TLDYEEFVKRFHEPAKDIGFNVAVLLTNLSEH 

MFNDTRLQTFLELAESVLNYFQPFLGRIEIMG 

SAKRIERVYFEISESSRTQWEKPQVKESKRQFI 

FDWNEGGEKEKMELFVNFCEDT1FEMQLAA 

QI SE SDLNERS ANKEESEKERPEE QGPRM AFF 

SILTVRSALFALRYNILTLMRMLSLKSLKKQM 

KJC VKKMT\^KDMVT AFFSS YWS IFMTL LHFV 

ASVFRGFFRJICSLLLGGSLVEGAKKIKVAELL 

ANMPDPTQDEVRGDGEEGERKPLEAALPSED 

LTDLKJ2LTEESDLL SDIFGLDLKREGGQYKLIP 

HNPNAGLSDLMSKPVPMPEVQEKFQEQKAK 

EEEKEEKEETKSEPEKAEGEDGEKEEKAKED 

KGKQKLRQLHTHRYGEPEVPESAFWKKIIAY 

QQKLLN^FARNFYNMRMLALFVAFAINFILL 

FYKVSTSSWEGKELPTRSSSENAKVTSLDSS 

SHRIIAVHYVLEESSGYMEPTVRJLPILHTV1SF 

FCIIGYYCLKVPLVIFKREKEVARKLEFDGLYI 

TEQPSEDDIKGQWDRLViNTQSFPNNYWDKF 

VKRKVMDKYGEFYGRDRJSELLGMDKAALD 

FSDAREKJOCPKKDSSLSAVLNSIDVKYQMW 

KLGWFTDNSFLYLAWYMT 


553 


1903 


A 


4199 


31 


767 


LPELNGRGAGLRRAEPSERGGGAERTQQVAA 
LPLSHGHSHGGGGCRCAAER/VGAARGSAAC 
AYGLYLRJDKGRLQCLNESREGSGRGVFKPW 
ERAD\DRSKFVE SD ADEELLFNIPFTG\HVKLK 
GIIIMGEDDDSFTPSEMRLYKNTPQMSFDDTER 
EPDQ r rFSLNRDLTGELEYATKISRFSNVYHLSl 
HISKNFGADTTKVFYIGLRGEWTEUUIHEVTI 
CNYEASANPADHRVHQVTPQTHFIS 


554 


1904 


A 


4200 


1 


961 


GIPCTEMGNFDNANVTGEIEFAIHYCFKTHSL 

EICIKACKNIJ^YGEEKJCKKCNPYVKTYLLPD 

RSSQGKRKTGVQRNTVDPTFQETLKYQVAPA 

QLVTRQLQVSVWHLGTLARRVFLGEVIIPLAT 

WDFEDSTTQSFRWHPLRAKADKYEDSVPQS 

NGELTVRAKLVLPSRTRKLQEAQEGTDQPSL 

HGQLCLVVLGAKNLPVRPDGTLNSFVKGCLT 

LPDQQKLRLKSFVLRKQACPQWKHSFVFSGV 

TP AQLRQS SLELTVWTX?ALFGMNDRLLGGT\ 

RLGSKGDTAVGGDACSQSKXQWQKVLSSPN 

LWTDMTLVLII 


555 


1905 


A 


4211 


331 


2419 


KENKKARNLRMNQSRSRSDGGSEETLPQDH 

NHHENERRWQQFRl.HRF.EAYYQFTNELNDE 

DYRLMRDHNLLGTPGEITSEELQQRLDGVKE 

QLASQPDLRDGTNYRDSEVPRESSHEDSLLE 

WLNTFRRTGK ATR SGQNGN QTW RA V S RTNP 

NNGEFRFSLEIHVNHENRGFEIHGEDYTOIPLS 

DSNRDHTANRQQRSTxSPVARRTRSQTSVNFN 

GSSSNIPRTRLASRGQNPAEGSFSTLGRLRNGI
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GGAAGIPRANASRTNFSSHTKQSGGSELRQRE 

GQRFGAAHVWENGARSNVTVRNTNQRLEPI 

RLRSTSNSRSRSP1QRQSGTVYI INSQRESRPV 

OOTTRRSVRRRGRTRVFLEODRERERRGTAY 

TPFSNSRI-VSRITVEEGEESSRSSTAVRRHPTIT 

LDLQVRVRIRPGENRDRJJSIANRTRSKVGLAE 

NTVTIESNSGGFRRTISRLERSGIRTYVSTITVP 

LRRI SENEL VEPS SV ALRSILRQIMTGFGELS SL 

MEADSESELQRNGQHLPDMHS ELSNLGTDN 

NRSQHREGSSQDRQAQGDSTEMHGENETTQP 

HTRNSDSRGGRQLRNPNNLVETGTLPILRLAH 

FFLLNESDDDDRIRGLTKEQIDNLSTRHYEHN 

SIDSELGK1CSVCISDYVTGNKLRQLPCMHEF 

HIHCIDRWLSENCTCPICRQPVLGSN1ANNG 


556 


1906 


A 


4212 


3 


462 


LQRQRQHPAAAPAVPVRCFTFCFTDIVIMPKR 
KSPENTEGKDGSKVTKQEFTRRSARLSAKPA 
PPKPEPKPRKTSAKKEPG AKJ SRG AKGKKEEK 
QEAGKEGTAPSENGETKAEEMISRSTVNVST 
SRGTPPSTLSVKGQIETVRVKGTEN 


557 


1907 


A 


4213 


774 


507 


ARRFSCLTLQTSWGHRH\GPPRP\ANFVFLVET 

GFLH1GQAGHKLPTSGDPPASASQSARITGMS 

HRTWFLASFLIDSCKNFIVYKIMYTL 


558 


1908 


A 


4225 


3 


1253 


TYRHAEREHPETSSATKVSYDYRHKRPKLLD 

GDQDFSDGRTQKYCKEEDRXYSFQKGPLNRE 

LDCFNTGRGRETQDGQVKEPFKPSKKDSIAC 

SNSSSNQLDKSQKLPDVKPSPINLRKKSLTVK 

VDVKKTVDTFRVA SSY STERQMSHDL VAVG 

RKSENFHPVFEHLDSTQNTENKPTGEFAQEirr 

TIHOVKANYFPSPGITLHFRFSNKMADIHKADV 

NE1PLNSDPEIHRRIDMSLAELQSKQAV1YESE 

QTL1KIIDPNDLRHDIERRRKERLQNEDEHIFHI 

ASAAERDDQN S SFSK3VYTTQRKDIITHKPFEV 

EGNHRNTRVRPFKSNFRGGRCQPNYKSGLVQ 

KSLYIQAKYQRLRFTGPRGFITHKFRERLMRK 

KKVP 


559 


1909 


A 


4235 


1 


323 


KFSIPFFLRWSFTLVXPRLEGNDMISVHCNLGL 
LGLSHSPASASQVGGITGTQHHTGLIFGFL1ET 
EFHHVGQAGLELLTSGDPPALAFQSAGITGVS 
HHAWLQVLNS 


560 


1910 


A 


4246 


2 


1569 


TLSLLERVLMKDIVTPVPQEEVKTV1RKCLEQ 

AALVNYSRLSEYAK1EGKKREMYELPVFCLA 

SQVMDLT1QNQKDAENVGRLITPAKKLEDTIR 

LAELVIEVLQQNEEHHAEAFAWWSDLMVEH 

AETFLSLFAVDMDAALEVQPPDTWDSFPLFQ 

LL\NDFLRTGLLICGNGK\FHKHLQDLFAPLW 

R/YMWDLDGSSPIAQSIHRGLLSRESWEPVNN 

GSGTSEDLFWKLDALQTFIRDLHWPEEEFGK 

HLEQRLKLMASDMIESCVXRTR\IAFEVKLQK 

TSSIQQEFRVPQFNMAPCFNVMGLMAKGSIQP 

KJ^\CSMEMGQEFAKMWHQYHSK1DEL1EETV 

KEMITLLVAKFVTILEGVLAKLSRYDEGTLFS 

SFLSFTVKAASKYVDVPKPGMDVADAYVTF 

VRHSQDVLRDKVNEEMYIERLFDQWrNSSM 

WICrWLTDRNfDLQLHIYQLKTLIRMVKKTY 

RDFRLQGVLDSTLN SKTYETTRNRLTVEEATA 

SV SEGGGLQGI SMKDSDEEDEEDD 


561 


1911 


A 


4257 


1300 


654 


SELVQFLLIKDQKKIPIKRAD1LKHVIGDYKDI 
FPDLFKRAAERLQYVFGYKLVELEPKSNTYTL 
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INTLEPVEEDAEMRGDQGTPTTGLLMIVLGLI 

FMKG>TITKJCTEAWDFLLAL\GVYPTKKHLIFG 

DPKKLITEDFVRQRYLEYRRIPHTDPVDYEFQ 

WGPRTNLETSKMKVLKFVAKVHNQDPKDW 

PAQYCEALADEENRARPQPSGPAPSS 


562 


1912 


A 


4260 


1 


1498 


MVTWLYRFLFTSNMAAKLRSLLPPDLRLQF 

WLHARLQKCFLSRGCGSYCAGAKASPLPGK 

MAMGLMCGRRELLRLLQSGRRVHSVAGPSQ 

WLGKPLTTRLLFPAAPCCCRPHYLFLAASGPR 

SLSTSAISFAEVQVQAPPWAATPSPTAVPEV 

ASGETADWQTAAEQSFAELGLGSYTPVGLI 

QNLLEFMHVDLGLPWWGAIAACTVFARCLIF 

PUVTGQREAAJRIHNHLPEIQKFSSRIREAKLA 

GDHIEYYKASSEMALYQKJCHG1KLYKPLILPV 

TQAPinSFFIALREMANLPVPSLQTGGLWWF 

QDLTVSDPI YILPLA VTATM WA VLELG AETG 

VQSSDLQWMRNVIRMMPLiTLPITMHFPTAV 

FMYWLSSNLFSLVQVSCLRIPAVRTVLKIPQR 

WHDLDKLPPREGFLESFKKGWKNAEMTRQ 

LREREQRMRNQLELAARGPLRQTFTHNPLLQ 

PGKDNPPNTPSS\SSSSSKPKSKYPWHDTLG 


563 


1913 


A 


4265 


623 


116 


MG G L APTQTLEPTNREYQNTQLS V S YLLPEQN 
THGTRRTLSSGPSNNLPLPLSSSATMPSMQCK 
HRSPNGGLFRQSPVKTTPPIPMSFQPVPGGVNL 
PRGSGNPPHGTSILT APPALL PHPPTHPTQQSF 
LIQENNNTNHTHSHTHTYTETLSFFLYICVNN 
DRMEWGKSVF 


564 


1914 


A 


4270 


3 


368 


ELKRKLSSLNSEVSTIQNTRMLAFKATAQLFIL 
GCTWCLGLLQVGPAAQVMAYLFTirNSLQGF 
FIFL V YCLLS\QQ VQKQYQK WFREI VK S KS ES 
ETYTL S SKMGPDSKPSEGD VFPRTSE 


565 


1915 


A 


4288 


83 


406 


RNSRPLWCSPPASQPRQAPVSQSCCCPLPSSSS 
PPSALLAPTKPRALGTLRLYECSPELCTTMLP 
PAWLLMLCQAPRPQDPDPRLTQPEKSLQEAP 
GQTGASRTPRT 


566 


1916 


A 


4298 


1041 


229 


LNSSQKLACLIGVEGGHSLDSSLSVLRSFYVL 

GVRYLTLTFTCSTPWAESSnCFRHHMYTNVS 

GLTSFGEKVVEELNRLGMMIDLSYASDTLLRR 

VLEVSQAPVEFSHSAARAVCDNLLNVPDDILQ 

LLKKNGGIVMVTLSMGVLQCNLLANVSTVA 

DHFDHERAVIGSEFIGIGGNYDGTGRFPQGL\E 

DVSTYPVLIEELLSRSWSEEELQGVLRGNLLR 

VFRQVEKVREESRAQSPVEAEFPYGQLSTSCH 

FHLGASEWTPRLLIWR 


567 


1917 


A 


4299 


1 


1106 


GATPLG SVGGRTGKMDAATLT YDTLRF AEFE 

DFPETSEPVW1LGRKYSIFTEKDEILSDVASRL 

WFTYRKNFPAJGGTGPTSDTG WGCMLRCGQ 

MIFAQALVCRHLGRDWRWTQRKRQPDSYFS 

VLNAFIDRKDSYYSIHQIAQMGVGEGKSIGQ 

WYGPNTVAQVLKKLAVFDTWSSLAVHIAMD 

NTVVMEEIRRLCRTSVPCAGATAFPADSDRH 

CNGFPAGAEVTNRPSPWRPLVLLIPLRLGLTD 

INEAYVETLKHCFMMPQSLGVIGGKPNSAHY 

FIGYVGEELIYLDPHTTQPAVEPTDGCFIPDES 

FHCQHPPCRMSIAELDPSIAVVRGGHLSTQAF 

GAECCLGMTRKTFGFLRFFFSMLG 


568 


1918 


A 


4300 


2012 


1843 


SRKFLTITPIVLYFLTSFYTXYDQIHFVLNTVS 
LMS VLIPKLPQLHG VRIFGIN K Y 


569 


1919 


A 


4302 


186 


531 


WTFCLFIVWWVPESARWLLTQGHVKEAHRY 
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LXHCARLNGRPVCEDSFSQEVRVNVCVSMW" 
CVWWGVGCVKCLPPRAHHIWQEKPLGPHRT 


570 


1920 


A 


4308 


3 


869 


RSGQGKVY GLIGRRRFQQMDVLEGLNLLITI S " 

GKRNTCLRVYYT.SWT,RNKILHKDPEVEKXQG 

WTTVGDMEGCGHYRVVKYERI1CFLVIALKSS 

VE VY A W APKP YHKFMAFKSFADLPHRPJLL V 

DLTVEEGQRLKVIYGSS AGFHAVD VD SGN S Y 

Ui i IrVHiybQITPHAIurLPNTDGMEMLLCYE 

DEGVYVNTYGRIIKDWLQWGEMPTSVAYIC 

SNQIMGWGEKAIEIRSVETGHLDGVFMHKRA 

QRLKFLCERNDKVFFASVRSGGSSQVYFMTL 

NRNCIMNW 


571 


1921


A 


4309


9 


524 


ASREMDVTKVCGEMRYQLNKTNMEKDEAE 
KEHREFRAKTNRDLEIKDQE1EKLRIELDESK 
QHLEQEQQKAALAREECLRLTELLGESEHQL 
HLTRQEKDSIQQSFSKEAKAQALQAQQREQE 
LTQKJQQMEAQHDKTENEQYLLLTSQNTFLT 
KLKEECCTLAKKLEQISQ 


572 


1922 


A 


4318 


1 


1119 


GATPLGS VGGRTGKMDAA'I'LT YDTLRFAEFE 

DFPETSEPVWILGRKYSIFTEKDEILSDVASRL 

WFTYRKNFPAIGGTGPTSDTGWGCMLRCGQ 

MIFAQALVCRHLGRDWRWTQRKRQPJDSYFS 

VLNAFLDRKDSYYS1HQIAQMGVGEGKSIGQ 

W YGPNTV AQ VLKKLAVFDTWS SLA VHIAMD 

NTVVMEEIRRLCRTSVPCAGATAFPADSDRH 

CNGFPAGAEVTNRPSPWRPLYLLJPLRLGLNT 

DINEAYVXETT AKHCFHGWPQFPG/VV HREG K 

HNaAHYFIGYVGEELrYLDPHTTQPAVEPTDG 

CFTPDESFHCQHPPCRMSIAELDPSIAVVRGGH 

LSTQAFGAECCLGMTRKTFGFLRFFFSMLG 


573 


1923 


A 


4333 


363 


1066 


GOVPVGLASKPFQILYGHTNEVLSVGISTELD 
MAVSGSRDGTV1IHTIQKGQYMRTLRPPCESS 
LFLTIPNLAI SWEGHTVVYSSTEEKTTLK\ERM 
HYJCFSINGKYLGSQlLKEQVSDICriGEHIVTG 
SIQGFLSIRDLHSLNLSINPLAMRLPIHCVCVT 
KEYSHILVGLEDGKLrWGVGKPAEVKPSISN 
r ianA v KjV Y t CiiPbFQLIEKSPLGINKLKAKFD 
FSKGSK 


574 


1924 


A 


4346 


359 


1234 


MDTLEEVTWANGSTALPPPLAPN1SVPHRCLL 
LLYEDIGTSRVRYWDLLLLIPNVLFLrFLLWK 
LPSARAKIRITSSPIFITFYILVFVVALVGIARA 
WSMTVSTSNAATVADKILWE1TRFFLLAIEL 
o v m^ui-ftr unJ^r-afLbMpLKVLAriTVLSLAYS V 
TQGTLFJL YFDAHL SAEDFNI YGHGGRQF WL 
VSSCFFFL V Y SL WILPKTPLKERI SLP SRRSF Y 
VYAGELALLNLLQGLGSVLLCFDIIEGLCCVD 
ATTFLYFSFFAPLIYVAFLRGFFGSEPKir .F 


575 


1925 


A 


4360 


2038 


1512 


GCWWRHPWLASQRDCLDCRJQLAEKFVKAV 

SKPSRPDMNPIRVKEVYRLEEMEKIFVRLEM 

KilKGSSGTPKLS YTGRDDRHFVPMGL YIVRT 

VNEPWTMGFSKSFKKKFFYNKKTKDSTFDLP 

ADSIAPFHICYYGRLFWEWGDGIRVHDSQKP 

QDQDKLSKEDVLSFIQMHRA 


576 


1926 


A 


4365 


59 


500 



QVEGRQGREYKRTAWRISPVWRPARCRRRST 

PQP/PF7PGAQQQERHRQGEAPMQALDPRAEP 

5PQAQSHAACQPEPEPPRVLLDPTAARGGVQ 

3RP/GLSRHPGLAPHPQTHTPWPQSGRLPCAS 

iPLPLGGERPTPGLEPKGRDLM 
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577 


1927 


A 


4366 


785


502 


S APP KKKN G VLF L S PRLKS S G AI WV H STPTL W 
AS SNS RASTPKV AGITG ARPHARJ1FVFLIEMG 
FHNVGQAGL/DTLTLVICPPQPPKLLGLQM 


578 


1928 


A 


4367 


1 


221 


FFFFLKKSRCVTQAGVQGXPISLHPPPPGFICRF 

SRLSLLSSWDYRHP/HAANFCIFSRDGYVSPYW 

SGWSRTPDLR 


579 


1929 


A 


4383 


1 


224 


FETESHSVTQAGMQWHNLGSLQPMP/PGLKR 
FSCLRLQSSWDHRHAPPHLAHFCIFSRDGVSP 
CWPQWSSTPDLK 


580 


1930 


A 


4397 


410 


94 


SRLKPYSTNVTAKXLPATKIPNLDCFTAKLYQ 
WFKKGIMHILHELFQNKEEGAFPNS/FYEASFT 
LRPKSDRDIAKEE S YSTISLL STDTKJLMSKYK 
QLKSSDL 


581 


1931 


A 


4414 


670 


3 


VLVHRQCGGILRLRRKEAVSVLDSADIEVTDS 
RLPHATIVDHRPQHRWLETCNAPPQLIQGKA 
RSAPKPSQASGHFSVELVRGYAGFGLTLGGG 
RDVAGDTPLAVRGLLKDGP\AQRCGRLEVGD 
LVLI IINGESTQGLTMIAQAVERIRAGGPQLHL 
V] RRPLETHPGKPRG VGEPRKG WPS WPDRSP 
DPGGPEVTGSRSSSTSLVQHPPSRTTLKKTRG 
SPE 


582


1932 


A 


4424 


194 


449 


VLYIRKKKRLEKLRHQLMPMYNFDPTEEQDE 

LEQELLEHGRDAASVQAATSVQAMQGKTTL 

PS\QGPLQRPSRLVFRDVANAIHV 


583 


1933 


A 


4435 


1 


166 


APGPPVPPPGSPPEQMPGPCPASMPP/DPPPGS 
PPEQMPGPCPVSAPP/GPPPGSPPEQMPGPCPV 
SAPPALLQDTSV 


584 


1934 


A 


4439 


1 


628 


SATPQQPSAPQHQGTLNQPPVPGMDESMSYQ 

APPQQLPSAQPPQPSNPPHGAHTLNSGPQPGT 

APATQHSQAGPATGQAYGPHTYTEPAKPKK 

GQQLWNRMKPAPGTOiVSSSTSRSDPLLLPPR 

AIAFTQRASTVVLAPSPT/SEKVQNHSGSSAR 

GNL SGKPDDWP/LGHERVCGALLHRL* VGGG 

QGPHGKAAQGGAAGAAAGRLGLYH 


585 


1935 


A 


4463 


10 


144 


HKPVTNSRDTQEVPLEKAKQVLK11ATFKHTT 
SIFDDFAHYEKRQ 


586 


1936 


A 


4464 


1309 


103 


LN AES YVSFTTKLDEPTAAK YEYGVPL QTSDS 

FLRFPSSLTSSLCTDNNPAAFXVNQAVKCTRK 

INLEQCEEIE AL SMAFY SSPEILRVPDSRKKVPI 

TVQSrVIQSLNKTLTRREDTDVLQPTLVNAGH 

FSLCVNVVLEVKYSLTYTDAGEVTKADLSFV 

LGTVS S VVVPLQQKFEIHFLQENTQP VPLSGN 

FGYWGLPLAAGFQPHKGSGUQTTNRYGQLT 

ILHSTTEQDCLALEGVRTPVLFGYTMQSGCK 

LRLTGALPCQLVAQKVKSLLWGQGFPDYVA 

PFGNSQGP/ADMLDWWIHFITQSTNRKDSCQ 

LPGALV1EVKWTKYGSLLNPQAKIVNVTANLI 

S S SFPEANSGNERTTLISTAVTF VDVS APAE AG 

FRAPP AINARLPFNFFFPFV 


587 


1937 


A 


4471 


614 


387 


LLGRASAC/LQLQSSW/D/HRPMLPYLANFVF 

CKDR/SFTWLPRLVLNSWLQVILLPWPPTGCD 

NKHEPPCPATKRRHSGSI 


588 


1938 


A 


4480 


1720 


1458 


HDLGSLQPPPPGFKRFSCLSLPSSWDYRLMPP 
CPANFCIIII/DFL VETGFHHVGQ A SHELLT SGD 
PPTSASQSAGrTGMSYHTWFGES 


589 


1939 


A 


4487 


922 


332 


APVTTSPRVGQPW/RTALALRSLYRARPSLRC 
PPVELPWAPRRGHRLSPADDELYQRTRISLLQ 
REAAQAMYIDSYNSRGFM1NGNRVLGPCALL 
PHSVVQWNVGSHQD1TEDSFSLFWLLEPRIEI 
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VWGTGDRTERLQSQVLQAMRQRGIAVEVQ 
DTPNACATFNFLCHEGRVTGAAL1PPPGGTSL 
TSLGQAAQ 


590 


1940 


A 


4492 


1 


472 


FFFFETESRSVAQAGVQWRDLGSLQAPPPGFT 
PFSCLSLPSSWDYRRPPLRPANFFVFLVETGFP 
RFSRDGLDLLT/S/GDPPTSASQSAGITGVSHR 
ARPKRI GEPRRKCGN A VV WPSTSLGDHRVTS 
VPHQGGLPGPIRVAPSSAGQREASQGPPGR 


591 


1941 


A 


4495 


1444 


1116 


IAARFTLAKT\WQLKRP\TMIDSIKKTR\YIYT 
MEYYADTERNEIMSFVAGTWVELEAIILSKLM 
LKDNWVEDTIPQGAVPCTATAEGMKRLLFAL 
EPWDSSCFPHPSSGV 


592 


1942 


A 


4496 


2 


919 


RTRPLFSGRPTRPVCTMSDERRLPGSAVGWL 

VCGGLSLLANAWGILSVGAKQKKWKPLEFL 

LCTLAATHMLNVAVPIATYSWQLRRQRPDF 

EWNEGLCKVFVSTFYTLTLATCFSVTSLSYHR 

MWMVCWPVNYRLSNAKKQAGHTVMGIWM 

GSFILSALPAVGWHDTSERFYTHGCRFIVAEI 

GLGFGVCFLLL VG GS V AMGVICTAI ALFQTL 

A VQVGRQ ADHRAPTVPTIVVED AQGKRRS SI 

DGSEPAKTSLOTTGLVTTIVFrYDCLMGFPVL 

GFFSLADTHLSDLPYTWGDRDSGGACVM 


593 


1943 


A 


4506 


2 


193 


FFFEAESCSVPQAGVQRPDLGWLHAPPPNGSC 
HFPASASQVAGTTHARHHTQLIF\AFLVENGL 
C 


594 


1944 


A 


4507 


1327 


647 


KMAGG VRPLRGLRALC RVLLFLS QFCILSGG 

E STEIPPYVMKCPSNGLCSRLPADCLDCTTNFS 

CTYGKPVTFDCAVKPSVTCVDQDFKSQKNFU 

NMTCRFCWQLPETDYECTNSTSCMTVSCPRQ 

RYPANCTVR\DHVHCLGNRTFPKMLYCNWT 

GGYKWVYGLWLLRHHPRWGLGADRRYLGP 

VAGTASGKLFSFGGLG1WTLIDVLLIGVGYVG 

PADGSLYI 


595 


1945 


A 


4512 


533 


264 


FFFKMESYSVARLECSGAISAPCNLHLLGSNN 
SPASASRV/AGNIGARHHTQQIFVLLVQMRVH 
YVGQDGLDLL/NLMIHPPRSPKVLGLQA 


596 


1946 


A 


4513 


3 


1674 


HASDHLYPNFI.VNT-r.n.KQKQRFEEKRFKLD 

HSVS STNGHRWQIFQD WLGTDQDNLDLANV 

NLMLELLVQKKKQLEAESHAAQLQILMEFLK 

VARRNKREQLEQIQKELSVLEEDIKRVEEMS 

GLYSPVSEDSTVPQFEAPSPSHSSI1DSTEYSQP 

PGFSGSSQTKXQPWYNSTLASRRKRLTAHFE 

DLEQCYFSTRMSRISDDSRTASQLDEFQEOLS 

KJATRYNSVRPUATLSYASDLYNGSQYKSLV 

FEFDRDCDYFAIAGVTKKIKVYEYDTVIQDA 

VDIHYPENEMTCNSK1SCISWSSYHKNLLASS 

DYEGTVILWDGFTGQRSKVYQEHEKRCWSV 

DFNLMDPKLLASGSDDAKVKLWSTNLDNSV 

ASIEAKANVCCVKFSPSSRYHLAFGCADHCV 

HYYDLRNTXQPIMVF-KGHRKAVSYAKFVSG 

EEIVSASTDSQLKLWNVGKPXYCLRSFKGHIN 

EKKFWGLASNGDYIACGSENNSLYLYYKGLS 

KTLLTFKFDTVKSVLDKDRKEDDTNEFVSAV 

CWRALPDGESNVLIAANS\QGTI\KVLELV 


597 


1947 


A 


4518 


536 


824 


RSLALSPGLECSGM1SAHCNLHLLGSSDPPTS 
ASQVAEITSVRHHTWLDPCM.GQMGFHHVGE 
QAGLELLTSWDPA1LPSQSAGIIGMSPHAWPP 


598 


1948 


A 


4524 


1 


384 


FDTEFVNIGGDFDAAAGVFRXCRLPGAYFFSF 
TLGKLPRKTLSVKLMKNRDEVQAMIYDDGSS 
RRREMQSQSVMLALRRGDAVWLLSHDHDG 
YGAYSNHGKYTTFSGFLVYPDLAPAAPPGLG 
ASELL 


599 


1949 


A 


4526 


366 


776 


MGQPAPYAEGPIQGGDAGELCKCDFLVFTSP 
NPEAVCEAGTPAMFQTAWRQMESCSI/AQAG 
VQWRDPGSLHPPPLGFKRFSCLSLPSSWDYK 
HAPPHPANFCIFSRDQVSPCWPGWSRSLDLVT 
PPPWLPKVLGLQA 


600 


1950 


A 


4529 


776 


334 


FFFETESCYVAQAGVQWCDLCSLQAPPPG\SS 
DPPASASRVAGTTGARHHTQLIFVFLVETGFH 
\ML ARDGLKLLTS SDPP AS AS QS S WD YRREPP 
RLANFFVFLVETGSRYVAQAGVQWLFTGAIP 
LLISTG VLTCS V SDLGRFTPP 


601 


1951 


A 


4533 


1460 


403 


HEVQESIHFLESEFSRGISDNYTLALITYALSS 

VGSPKAKEALNMLTWRAEQEGGMQFW V SSE 

SKLSDSWQPRSLDEEVAAYALLSHFLQFQTSE 

GIP1MRWLSRQRNSLGGFASTQDTTVALKALS 

EFAALMNTERTNIQVTVTGPSSPSPVKFLIDT 

HNRLLLQTAELADGTANGSV/SISANGFGFAI 

CQLNWYNVKASGSSRRRRSIQNQEAFDLDV 

AVKENKDDLNHVDLNVCTSFSGPGRSGMAL 

MEVNLLSGFMVPSEAISLSETVKKVEYDHGK 

LNLYLDSVNETQFCVNIPAVRNFKVSNTQDA 

SVSIVDYYEPRRQAVRSYNSEVKLSSCDLCSD 

VQRLPSL 


602 


1952 


A 


4540 


1963 


295 


MRAPGRPALRPLPLPPLLLLLLSSPWGRAVPC 

VSGGLPKPANITFLSINMKNVLQWTPPEGLQG 

VKVTYTVQYFIYGQKKWLNKSECRNrNRTYC 

DLSAETSDYEHQYYAKVKAIWGTKCSKWAE 

SGRFYPFLETQIGPPEVALTTDEKSISWLTAP 

EKWKRNPEDLPVSMQQIYSNLKYNVSVXNT 

KSNRTWSQCVTKHTLVLTWNLEPNTLYCVHV 

ESFVPGPPRRAQPSEKQCARTLKDQSSEFKAK 

rTFWYVLPISITVFLFSVMGYSIYRYIHVGyCEK 

HP\ANLILIYG\NEFDKRFFVPA\EKIV\INF]\TL 

NIS\DDSKISHQDMSLLGKSSDVSSLNDPQPSG 

NLRPPQEEEEVKHLGYASHLMEIFCDSEENT\ 

EGTSFTQQESLSRT1PPDKTVIEYEYDVRTTDI 

CAGPEEQELSLQEEVSTQGTLLESQAALAVL 

GPQTLQYSYTPQLQDLDPLAQEHTDSEEGPEE 

EPSTTLVDWDPQTGRLCIPSLSSFDQDSEGCE 

PSEGDGLGEEGLLSRJLYEEPAPDRPPGENETY 

LMQFMEEWGLYVQMEN 


603 


1953 


A 


4543 


3 


600 


YSA VEFVEQ ASGISD W WNP ALRKRMLSD S GL 

GMLAPYYEDSDLKDLSHSRVLQSPVSSEDHAI 

LQ A VIAGDLMKLIES YKNGGSLLIQGPDHCSL 

LHYAAETGNGEIVKYILDHGPSELLDMADSE 

TGETALHKAACQRNRAVCQLLVDAGASLRKX 

TDSKGKTPQERAQQA\GDPDLAAATIESRQN 

YKVIGHEDLETAV 


604 


1954 


A 


4548 


3 


938 


QDNKVQNGSLHQKDTVHDNDFEPYLTGQAN 

QSNSYPSMSDPYLSSYYPPSIGFPYSLNEAPW 

STAGDPPIPYLTTYGQLSNGDHHFMHDAVFG 

QPGGLGNMYQHRFNFFPENPAFSAWGTSGS 

QGQ<3TQS S A YGS S YTYPPSSLGGTVVDGQPG 

FHSDTLSKAPGMNSLEQGMVGLKIGDVSSSA 

VKTVGSVVSSVAL'I GVLSGNGGTNVNMPVS 

KPTSWAAIASKPAKPQPKMKTKSGPVMGGG 

LPPPPIKHNMDIGTWDNKGPVPKAPVPQQAP 
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SPQAAPQPQQVAQPLPAQPPALAQPQYQSPQ 
QPPQ 


605 


1955 


A 


4553 


2 


2304 


ILLQEKRNCLLMQLEEATRLTSYLQSQLKSLC 

ASTLTVSSGSSRGSLASSRGSLASSRGSLSSVS 

FTDrYGLPQYEKPDAEGSQLLRFDLIPFDSLGR 

DAPFSEPPGPSGFHKQRRSLDTPQSLASLSSRS 

SLSSLSPPSSPLDTPFLPASRDSPLAQLADSCE 

GPGLGALDRLRAHASAMGDEDLPGMAALQP 

HGVPGDGEGPHERGPPPASAPVGGTVTLRED 

SAKRLERRARPJSACLSDYSLASDSGVFEPLT 

KRNED AEEPA YGDTASNGDPQIH VGLLRDS G 

SECLLVHVLQLKNPAGLAVKEDCKVHIRVYL 

PPLDSGTPNTYCSKALEFQVPLVFNEVFRIPV 

HSSALiUCSLQL YVCS VTPQLQKELLGIAQrN 

LADYDSLSEMQLRWHSVQVFTS\LNHQGRGR 

LGVQERAPPGTLHTPSPSPA/STDAVTVLLAR 

TTAQLQAVERELAEERAKLEYTEEEVLEMER 

KEEQ AE AISERS W QADSVDSGCSNCTQTSPPY 

PEPCCMGIDSILGHPFAAQAGPYSPEKFQPSPL 

KVDKETNTEDLFLEEAASLVKERPSRRARGSP 

FVRSGTIVRSQTFSPGARSQYVCRLYRSDSDS 

STLPRKSPFVRNTLERRTLRYKQ SCRSSL AEL 

MARTSLDLELDLQASRTRQRQLNEELCALRE 

LRQRLEDAQLRGQTOLPPWVLRDERLRGLLR 

EAERQTRQTKLDYRHEQAAEKMLKKASICEI 

YQLRGQSHKEPIQVQTFREKIAFFTRPKJNIPPL 

PADDV 


606 


1956 


A 


4555 


3429 


776 


PGSGPGPAPFLAPVAAPVGGISFHLQIGLSREP 

VLLLQDSSGDYSLAHVREMACS1VDQKFPEC 

GFYGMYDKTLLFRHDPTSENILQLVKAASDIQ 

EGDLIEWLSASATFEDFQIRPHALFVHSYRA 

PAFCDHCGEMLWGLVXRQGLKCEGCGLNYH 

KRCAFKIPNNCSGVRRRRLSNVSLTGVSTIRT 

SSAELSTSAPDEPLLQKSPSESFIGREKRSNSQ 

SYIGRPfflUJKILMSKVKVPHTFVIHSYTRPTV 

CQYCKKLLKGLFRQGLQCKDCRFN CHKRCA 

PKVPNNCLGEVTTNGDLLSPGAESDWMEEG 

SDDNDSERNSGLMDDMEEAMVQDAEMAMA 

ECQNDSGEMQDPDPDHEDANRTISPSTSNNIP 

LNtRVVQSVKHTKRKSSTVMKEGWMVHYTS 

KDTLRKRHYWRLDSKCITLFQNDTGSRYYKE 

rPLSEDLSLEPVKTSALrPNGANPHCFEITTANV 

VYYVGENVVNPSSPSPNNSVLTSGVGADVAR 

MWEIAIQHALMPVIPKGSSVGTGTNLHRDISV 

SISVSNCQIQENVDISTVYQrFPDEVLGSGQFGI 

VYGGKHRKTGRDVAIKJIDKLRFPTKQESQLR 

NEVAILQNLHHPGVVNLECMFETPERVFYVM 

EKLHGDMLEMILSSEKGRLPEHITKFUTQILV 

AIJtHLHFKNIVHCDLKPENVLLASADPFPQV 

KLCDFGF ARI1GEK.SFRRS V VGTPA YL APE VL 

RNKGYNRSLDM WS VG VirY VSL SGTFPFNED 

EDIHDQIQNAAFMYPPNPWKEISHEAIDLINN 

LLQVKMRKRYSVDKTLSHPWLQDYQTWLDL 

RELECKIGERYITHESDDLRWEKYAGEQGLQ 

YPTHLJNPSASHSDTPETTETEMKALGERVSlL 


607 


1957 


A 


4563 


1 


4499 


SRPW WLRASERPS APS AMAKRS RGPGRRCLL 
ALVLFCAWGTLAWAQKPGAGCPSRCLCFRT 
TVRCMHLLLEA VP A V APQTS I L DLRFNRJ REI 
QPGAFRRLR14LNTLLLKKNQIKRIPSGAFEDL 
ENLKYLYLYKNEIQSIDRQAFKGLASLEQLYL 
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HFNQIETLDPDSFQHLPKJLERLFLHNNRJTHL 

VPGTFNHLESMKKLRIJ2SNTLHCDCE1LWLA 

DLLKTYAESGNAQAAAICEYPRRIQGRSVATI 

ITEELNCERPRITSEPQDADVTSGNTVYTTCR 

AEGNPKPEIIWLRNNNELSMKTDSRLNLLDD 

GTLMIQNTQETDQGIYQCMAKNVAGEVTCTQ 

EVT1.RYFGSPARPTFVTQPQNTEVLVGESVTL 

ECSATGHPPPRJSWTRGDRTPLPVDPRVNITPS 

GGLYIQNWQGDSGEYACSATNNIDSVHATA 

FUVQALPQFTVTPQDRWIEGQTVDFQCEAK 

GNPPPVIAWTKGGSQLSVDRRHLVLSSGTLRI 

SGVALHDQGQYECQAVNIIGSQKVVAHLTVQ 

PRVTPVFASIPSDTTVEVGANVQLPCSSQGEP 

EPAITWNKDGVQVTESGKFHISPEGFLTINDV 

GPADAGRYECVARNTIGSASVSMVLSVNVPD 

VSRNGDPFVATSrVEAIATVDRAJNSTRTHLF 

DSRPRSPNDLLALFRYPRDPYTVEQARAGEIF 

ERTLQLI QEHVQHGLMVDLNGT S YHYNDLV S 

PQYLNL1ANLSGCTAHRRVNNCSDMCFHQKY 

RTHDGTCNNLQHPMWGASLTAFERLLKSVY 

ENGFNTPRGINPHRLYNGHALPMPRLVSTTLI 

GTETVTPDEQFTHMLMQ WGQFLDI EDLDSTV 

VALSQARPSDGQHCSNVCSNDPPCFSVMIPPN 

DSRARSG ARCMFFVRS SPVCGS GMTSLI -MNS 

VYPREQ1NQLTSYIDASNVYGSTEHEARSIRD 

LASHRGLLRQGrVQRSGKPLLPFATGPPTECM 

RDENESPIPCFLAGDHRANEQLGLTSMHTLW 

FREHNRIATELLKLNPHWDGDTIYYETRKIVG 

AEIQHITYQHWLPKILGEVGMRTLGEYHGYD 

PGl>fAGIFNAFAT\AAFRFGHTLVNPLLLPGLD 

ENFQP1AQDHLPLHKAFFSPFRIVNEGGIDPLL 

RGLFGVAGKMRVPSQLLNTELTERLFSMAHT 

VALDLAAINIQRGRDHGIPPYHDYRVYCNLS 

AAHTFEDLKNEKNPEIREKLKRLYGSTLNID 

LFPALWEDLVPGSRLGPTLMCLLSTQFKRLR 

DGDRLWYENPGVFSPAQLTQIKQTSLARILCD 

NADNTTRVQSD VFRVAEFPHG YG SCDEIPRVD 

LRVWQDCCEDCRTRGQFNAFSYHFRGRRSLE 

FSYQEDKPTKKTRPRK1PSVGRQGEHLSNSTS 

A'\FSTRSDASG\T7*JDFQRVCSWEMQKTITDLR 

TQIKKLESR\LSTTECVDAGGESHANNTKWK 

KDACTICECKDGQVTCFVEACPPATCAVPVNI 

PGACCPVCLQKRAEEKP 


608 


1958 


A 


4566 


354 


1135 


FSFLC/GVSGRLGLDSEEDYYTPQKVDVPKAL 
nVAVQCGCDGTFLLTQSGKVLACGLNEFNKL 
GLNQCMSGIINHEAYHEVPYTTSFTLAKQLSF 
YKIRTIAPGKTHTAAIDERGRLLTFGCNKCGQ 
LGVGNYKKRLGINLLGGPLGGKQVIRVSCGD 
EFTIAATDDNHIFAWGNGGNGRLAMrPTERP 
HGSDICTSWPRPIFGSLHHVPDLSCRGWHTILI 
VEKVLNSKTIRSNSSGLSIGTVFQSSSPGGGGE 
GGPDAW 


609 


1959 


A 


4567 


1 


412 


FFFFETESRSVAQAGVQWRDLGSLQAPPPGFT 

PFSCLSLPSSWDYRRPPLRPANFFVFLVETGF 

HRFSRDGLDLLT/S/GDPPASASQSAGITGVSH 

RARPRJNLRKVIYSFAVTYCLNYISLAMSSTL 

KLSFHVLSGS 


610 


1960 


A 


4570 


697 


467 


ECRGVISAH\CCTLCLPSSSDSASAF\RVARTT 
GTCDYAQLIFAFLVEMGFHHVGQDGLHLiyN 
LVIRPPRPPKVLGLQA 
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611 


1961 


A 


4571 


25 


1396 


ADPHTTVIRFFPAASATKRVLPPVLRVSSPRT 
WNPN VPESPR1PAPRLPKRMS G APTAG AALM 
LCAATAVLLSAQGGPVQSKSPRFASWDEMN 
VLAHGLLQLGQG\CANT\GAHPQSAERAGA\R 
LSACGSACQGTEGSTDI PLAPESRVDPEVLHS 
LQTQLKAQNSRIQQLFHKVAQQQRHLEKQHL 
R1QHLQSQFGLLDHKHLDHEVAKPARRKRLP 
EMAQPVDPAHNVSRLHRLPRDCQELFQVGER 
QSGLFEIQPQGSPPFLVNCKMTSDGOWTVIQR 
RHDGSVDFNRPWEAYKAGFGDPHGEFWLGL 
EKVHSITGDRN SRL A VQLRD WDGNAELLQFS 
VHLGGEDTAYSLQLTAPVAGQLGATTVPPSG 
LSVPFSTWDQDHDLRRDKNCAKSLSGGWWF 
GTCSHSNLNGQYFRSIPQQRQKLKKGIFWKT 
WRGRYYPLQATTMLIQPMAAEAAS 


612 


1962 


A 


4575 


162 


3 


FFFETESRSVAQAGVQWRDLSSLQPPPPG\SR 
GSPA S ASPV AGITGTRHHRTRG 


613 


1963 


A 


4584 


687 


321 


PLAQRRPFLWVTVKTNGHiWGSSTYPHFWGS 
SNS/PASASQVAGIPNARHQARIIFVFLVEPRF 
HHVGRAGLGFL/NLAICLPQHPKVLGLQACN 
LNIKJHPAHKYISMIQFNVHFMCMSVHIYI 


614 


1964 


A 


4589 


727 


299 


PGS AQSAQRG RGRRRARAGS ATQITM YSFMG 

GGLFCAWVGTILLWAMATDHWMQYRLSGS 

FAHQGLWRYCLGNKCYLQTDSIAYWNATRA 

FMILSALCAISGUMGIMAF/GWVAVLMTFFA 

GIFYMCAYRVHECRRLSTPR 


615 


1965 


A 


4590 


2 


414 


TILPEKIQAWAQKQCPQSGEEAVALWHLEK 
ETGRLRQQVSSPVHREKHSPLGAAWEVADFQ 
PEQ VETQPRA V SREEPG SLHSGHQEQLNRKR 
ERRPLPKNARPSPWVPALADEWNTLHQEVTT 
TRLPAGSQEPVKD 


616 


1966 


A 


4592 


773 


488 


DFALVAQAGVQWHNLGSPQPLPPGFKRFSCL 
SLPSSWEYRCVPP/RLANFVFLVEMGFLHVGQ 
AGLELPTSGDPPALASQSAGITGVTTVPSGPG 


617 


1967 


B 


4595 


84 


478 


XRHGLREPLLERRCAAASSFQHSSSLGRELPY 

DPVDTEGFGEGGDMQERFLFPEY1LDPEPQPT 

REKQLQELQQQQEEEERQRQQRREERRQQNL 

RARSREHPWGHPDPALPPSGVNCSGCGAEL 

HCQDAR* 


618 


1968 


A 


4596 


2945 


1188 


ARSRNSARGVYGMCVDTLFLCFLEDLERNDG 

SAERPYFMCSTLKKPLARRCFPAIHAYKGVL 

MVGNETTYEDGHGSRKNITDLVEGAKKANG 

VLEARQLAMRJFEDYTVSWYWmGLVIAMA 

M SLL S IILLHLL AGIMG WVMIIMEIVSELGYRIF 

HCYMEYSRLRGEAGSDVSLVDLGFQTDFRV 

YLHLRQTWLAFMHLSILEVHILLLIFLRKRJLI 

AIALIKEASRAVGYVMCSLLYPLVTFFLLCLCI 

AYWASTAVFLSTSNEAVYKIFDDSPCPFTAKT 

CNPETFPSSNESRQCPNARCQFAFYGGESGYH 

RALLGLQIFN AFMFF WLAN F VL ALGQVTL AG 

AFASYYWALRKPDDLPAFPLFSAFGRALRYH 

TGSLAFGALILAIVQIIRVILEYLDQRLKAAEN 

KFAKCLMTCLKCCFWCLEKFIKFLNRNAYIM 

IArYGTNFCTSARNAFFLLMRNlIRVAVLDKV 

TDFLFLLGKLLIVGSVGrLAFFFFTHRIRIVQDT 

APPLNYYWVP1LTVIVGSYLIAHGFFSVYGMC 

VDTLFLC FLED LERNDG S AERP YFM S STLKKL 

LNKTNKKAAES 


619 


1969 


A 


4601 


2 


357 


RTSVEPYILGEP/RKLSNNTKVVKTEYKATEY 
GLAYGHFSYEFSNHRDVWDLQGWVTGNGK 

GLrYLTDPQIHSVDQKVFTTNFGKRGIFYFFN 

NQHVECNE1CHRLSLTRPSMEKPCKS 


620 


1970 


A 


4606 


1 


2415 


MERL WGLFQRAQQL S PRSSQTV YQR VEGPR 

KGHLEEEEEDGEEGAETLAHFCPMELRGPEP 

LGSRPRQPNLIPWAAAGRRAAPYLVLTALJLIF 

TGAFLLGYVAFRGSCQACGDSVLVVSEDVN 

YEPDLDFHQGRLY W SDLQ AMFLQFLGEGRL 

EDTIRQTSLRERVAGSAGMAALTQDIRAALS 

RQKLDHVWTDTHYVGLQFPDPAHPNTLHWV 

DEAGKVGEQLPLEDPDVYCPYSAIGNVTGEL 

VYAHYGRPEDLQDLRARGVDPVGRLLLVRV 

GVISFAQKVTNAQDFGAQGVLIYPEPADFSQ 

DPPKPSL SSQQA VYGH VH LGTGDP YTPGFPSF 

NQTQFPPVASSGLPSIPAQPISADIASRLLRKL 

KGPVAPQEWQGSLLGSPYHLGPGPRLRLWN 

NHRTSTPINNIFGCIEGRSEPDHYWIGAQRDA 

WGPGAAKS A VGTAILLEL VRTFS SMVSNGFR 

PRRSLLF1SWDGGDFGSVGSTEWLEGYLSVL 

HLKAVVYVSLDNAVLGDDKFHAXTSPLLTSL 

IE S VLKQVDSPNH S GQTLYEQ WFTNXPS WD\ 

AEVIRPLPM\DSSAY\SFTAFVGVPAVEFSFME\ 

DDO^AYPFLHTKEDTYENLHKVLQGRLPAVA 

QAVAQ T , AGQLLIRLS HDRLLPLDFGRYGD V V 

LRHIGNLNEFSGDLKARGLTLQWVYSARGDY 

IRAAEKLRQEIYSSEERDERLTRiVTYNVRIMRV 

EFYFLSQYVSPADSPFRHIFMGRGDHTLGALL 

DHLRLLRSNSSGTPGATSSTGFQ\ESRFRRQL\ 

ALLMWDACKGAANALSGDVWNrDNNF 


621 


1971 


A 


4610 


793 


334 


ISRVDDFVGSGIANVI1AVAIFSIPAFARLVRG\ 

NTLVLKQQTFIESARSIGASDMTVLLRHILPGT 

GSSIWFFTMRIGTSIISAASLSFLGLGAQPPTP 

EWGAMLNEARADMVTAPHVAVFPALAIFLTV 

LAFNLLGDGLRDALDPKIKG 


622 


1972 


A 


4614 


2 


820 


L VYVMIA I FC IA S AM S L YNCLAALIHKIPYGQ 

CTIACRGKNMEVRLIFL S GLCIAV A WW AVF 

RNEDRWAWTLQDILG1AFCLNLIKTLKLPNFK 

SCVILLGLLLLYDVFFVFITPFITKNGESIMVEL 

AAGPFGNNEKNDGNLVEATOQPSAPHEKLPV 

VIRVPKL1 YFSVMS VCLMPVS ILGFGDrrVPGL 

LLAYCRRFDVQTGSSYIYYVSWTVAYAIGMIL 

TFWLGVLMKKGQPALLYLVPCTLITA/CQFV 

AWETVREMKKFWERVTS 


623 


1973 


A 


4619 


17 


691 


TLVSVVEFVRRADLTREDLAPS S VDSGQAGF 
GGCCESGLPNTMPSAFSVSSFPVSIPAVLTQT 
DWTEPWLMGLATFHALCVLLTCLSSRSYRLQ 
3GHFLCL VIL VYCAEYINEAAAMNWRLFSK Y 
QYFDSRGMJ-ISIVFSAPLLVNAMIIVVMWVW 
KTLN VMTDLKNAQERRKEKKRRRKED* GAA 
AAWSLRPSRPPSAAPSAAVCVAWASFQLTHG 
LKNRCFI 


624 


1974 


A 


4622 


164 


668 


VSCYTALQSIMNQPESANDPEPLCAVCGQAH 
SLEENHFYSYPEEVDDDLICHICLQALLDPLD 
TPCGHTYCTLCLTNFLVEKDFCPMDRKPLVL 
QI ICKKS SIL VNKLLNKLL VTCPFREHCTQ VL 
QRCDLEHHFQTSQA WGTHL* SQLLGRLRQED 
CLSPGVHHCSEV 


625 


1975 


A 


4623 


474 


473 


CFLSPSPLLPPLLLSSSSSFSFPLPPPPTLLPSTLP 
PPLLIPSS'LSP 
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626 


1976 


A 


4629 


249 


3 


KLKGN ECFC YHCN VCIFLMIKX* GLFLC* I YF I 
LFFET* SHSFTRLECSGTIS AHCSLQLQGS SNSP 
ASASQVAGIAGTHH 


627 


1977 


A 


4635 


1 


301 


FFFFETKPFFAPQAGGQGPSRGSLNPLPTGLK 
QFSGLTT .SRSGNNGPRPPPRVKFGILRGNGVP 
PGG AG* PRPPDLRGPPGL APPQGGNNGGDPP 
ARAYL 


628 


1978 


A 


4648 


1357 


782 


KLFSSQRLFGPHIQAJNPSFLLLSFFPS*LLAMR 
TVGNNAFILVFLVYRIVLLLF*HV*PAYFQPSK 
NKTAKINCN*RPFLFLVCYLL*AELHIGIFIANF 
YDCIPNKLNEHLWPKLLQSLIFHVDFCGFLHK 
VFYICFTEFLLFLYFL*LFIIKVSCSII*CSTICVF 
SYKSFAVIIFFVDNTRFFSFGF 


629 


1979 


A 
r\ 




is 


999 


HHFT HTI FI I OMPKFVT TR^FIODVNYSI FAV 

KVKTVCQIPLMKEMLKRFQVAVNLAEDTAH 

PKLVFSQEGRYVKNTASASSWPVFSSAWNYF 

AGWRNPQKTAFVERFQHLSCVLGKNVFTSG 

KHYWEVESRDSLEVAVGVCREDVMGITDRS 

KMSPDVGIWAIYWSAAGYWPLIGFPGTPTQQ 

EPALHRVGVYLDRGTGNVSFYSAVDGVHLH 

TFSCS S VS RLRPFFWLSPLASL VIPPVTDRK*G 

FS SPDQNSFP WQL RDTHPWALFCPS CL YPG 

WSIFWVSLTVPFGICPLCASQEAVPWEVGLA 

NGDGTGNFPRRFWEIFL 


630 


1980 




4669 


2 


358 


FFFFFETESHSVAOAGMOWRNLGSLPAPPPGF 
TPFFCLSLLNGWDYRRPPPHLANFFVLLVETG 
FHDVGQDGLDLLTS*STPSASQSAEITGVSHC 
TRLKKIRFAKGHVEFFFESHVE 


631 


1981 


A 


4674 


953 


614 


TPIRGTDDEHEECTVQEYSAGKNTCLRPGAV 
AHTCNPCTLGGRGRWIT*GSGVQDQPGPTWQ 
NPVFLERRPRALHSSPGLTTQRILWAQGLWV 
GAGSTGCSRGPRGEGVFREG 


632 


1982 


A 


4678 


34 


314 


RSTHASGMtSPSFGFMGHL-LRLEFEILPSTPNP 

♦LPSYQGEAAGSSLISHLQTFSPDLKGVYCTFP 

ASGLAPVPTHWTVSELSRSPVATATFC 


633 


1983 


A 


4696 


1 


1365 


RTLGMEGERRASQAPSSGLPAGGANOESPGG 

GAPFPGSSGSSALLQAEVLDLDEDEDDLEVFS 

KDASLMDMNSFSPMMPTSPLSMTNQIKFEDEP 

DLKDLFIl-VDEPESHVTTIETMTYRHTKTSRG 

EFDSSEFEVRRRYQDFL WLKGKLEE AHPTLII 

PPLPEKFIVKGMVERFNDDFIETRRKALHKFL 

NRIADHPTLTFNEDFKIFLTAQAWELSSHKKQ 

GPGLLSRMGQTVRAVASSMRGVKNRPEEFM 

EMNNFIELFSQKINLIDKJSQRJYKEEREYFDE 

MKEYGPIHILWSASEEDLVDTLKDVASCIDRC 

CKATEKRMSGLSEALLPVVHEYVLYSEMLM 

GVMKRRDQIQAELDSKVEVLTYKKADTDLL 

PEEIGKLEDKVECANNALKADWERWKQNM 

QNDIKLAFTDMAEENIHYYEQCLATWESFLT 

SQTNLHLEEASEDKP 


634 


1984 


A 


4708 


421 


158 


SYWVGEDYTYKFFEVILIDPFHKAIRRNPDTQ 
WISKAVYKHREMCGLTSTGRKSHGLEKDRM 
FPHAIGGSCRAA*RRRKTLQFPCYH 


635 


1985 


A 


4709 


42 


341 


YTKQPDAKERRRTVHWKKETESEASErnPPST 
PGVPQAPGHWED YGRGDNFYLPH • DPG GIVL 
WNIFNRMPIARKNrrDGEHHEYLIEVPRLFHT 
SED 


636 


1986 


A 


4721 


2 


351 


EKPDHFFPEGTSFIHEPRRPN*GDLVHCLGG1S 
RSTTVTV A* LMQKLNLSMNDA Y YT VTMKMSS 
ISPNFNSMDQPLDFQRTLGLRSPCYNRVPAQK 
M YFTTPSN HN A YQ VDS VQST 


637 


1987 


A 


4726 


664 


253 


NTGLTCSIQRKCGETQLYRREENRLILLLQDH 
LKSESFQVLTLSPRLEFSGLISAHCNLRLPGSS 
DSSASSSRAAGITGVHHHAWLIFFFLVETGFL 
H AG* AGLELLTSGDPP AS ASRS AGrTGV SHHA 
RPRETRFL 


638 


1988 


A 


4734 


24 


592 


GGMDSRVSGTTSNGETKPVYPVMEKKEEDG 

TLERGHWNNKMEFVLSVAGEIIGLGNVWRFP 

YLCYKNGGGAFFIPYLVFLFTCGIPVFLLETAL 

GQYTSQGGVTAWRKlCPIFEGIGYASQMTvTL 

LNVYYII VL AW ALFYLF SSFT1DLP WGGC YHE 

WNTEHCMEFQKTNGSLNGTSENATSPVIEFW 


639 


1989 


A 


4743 


1040 


699 


QGLTLLPRMECSATITAHCSLELPGSIDLPTSA 
S*VARTTGTHHHPWL1LVLLL**TWGSYYVAQ 
AGLELLGSSNLPAAMVSQSAQIIGHDHCAWA 
TSNHVLYTQEGLRRGKEG 


640 


1990 


A 


4771 


527 


2 


G RIDCPHP ATVL A Q P EFED ACSVLG A Y QGA QN 

WIRRRPCLPSGCLKMNRE1GPLQHSLCCPGWS 

QTPGLKAEXRQPPK* LGLQMESHSCPPAWS A 

MARSRLTATSASQVQAILLPQPPGTTDSCSPS 

PDHEQQPLSWVLPPPQKDMNPRJEQQVALGP 

QAAALPWAVWRNDCFPR 


641 


1991 


A 


4780 


16 


473 


RPSSQCGGIPTGWKKGLAPELSSELSSPPLPAR 

LQLAASPYFSPSWAECPQPVPAGTHATWCLA 

RVWARMTPPGPAGIPSHPLPPPPPERSVPIPSP 

FPARDSGSRQGHSTDRYKHTDAPRDAHRRVP 

QRDTDTGVHTGSGTHTHAHTPPEK 


642 


1992 


A 


4798 


1 


487 


GYSFRCDIVDYSRSPTALRMARTCWLYYFSK. 
FIELLDTIFFVLRKKNSQVTFLHVFHHTIMPW 
-nVWFGVKFAAGGLGTFHALLNTAVHVVMY 
SYYGLSALGPAYQKYLWWKKYLTSLQLVQF 
VTV AIHI SQFFFMEDCK YQFP VF ACIIMSY SFM 
FLLLFLH 


643 


1993 


A 


4799 


2 


391 


LMAFIEMHISGSLVYLKIKTKJYSYFSMLNFLL 
QEIPLSEILRI S SPRDFTNI SQG SNPHCFEIITDT 
MVYFVGENNGDSSHNPVLAATGVGLDVAQS 
WEKAIRQALMPVTPQASVCTSPGQGKDHSK 
Q» ASVCTSPGQGKDHSKQ 


644 


1994 


A 


4800 


488 


101 


AYPLFAVHPVHTECVAGVVGRAYLLCALFFL 
LSFLG YCKAFRESNKEG AHSSTFW VLLS IFLG 
AVAMLCKEQG1TVLVRAATWLGPAFSVCPFP 
SYKDIWGWPCLCGVLHAYIPLLV 


645 


1995 


A 


4805 


458 


126 


LLWTTVLCQTPARPQSTMIHLGHILFLLLLPV 
AAAQTTPGERSSLP AFYPGTSGSCSGCG SLSL 
PLLAGLVAADAVASLLIVGAVFLCARPRRSP 
AQEDGKVYINMPGRG 


646 


1996 


A 


4817 


47 


1033 


LQGDT WHLSFLSHFSRLHGGVPGRGI ,LEGNI , 

LQPQAPGHDMTSrPFPGDRLLQVDGVILCGLT 

HKQAVQCLKGPGQVARLVLERRVPRSTQQC 

PSANDSMGDERTAVSLVTALPGRPSSCVSVT 

DGPKF* SSN*KR1ANGLGFSFVQMEKESCSHL 

KSDLVRIKRLFPGHPAEENGAIAAGDIILGRE 

WEGPRKASSSRCRGSWAMQLSVQAGPSFAS 

YYPAAVEVLHLLRGAPQEVTLLLCRPPPGAL 

PELEQEWQTPELSADKEFTRATCTDSCTSPIL 

GSRGQLGOTVPPQMQGKAWGLRPESSQKAIR 

EGTMGAKTERDLGPVP 


647 


1997 


A 


4854 


1044 


335 


PR VRGDWPLEKKKSN SNIHPIFS WCGSTD SKD 
rVMPTYDLTDSVLETMGRVSLDMMSVQANT 

GPPWESKNSTAVWRGRDSRKERLELVKLSRK 

HPELIDAAFTNFFFFKHDENLYGP1VXHISFFD 

FFKHKYQIN1DGTVAAYRLPYLLVGDSVVLK 

QDSIYYEHFYNELQPWKHYIPVKSNLSDLLEK 

LKWAKDHDEEAKKIAKAGQEFARNNLMGD 

DI F CY YF QTFPRNMP I Y K 


648 


1998 


A 


4867 


2030 


837 


AGMLPAVGSADEEEDPAEEDCPELVPMETTQ 

SEEEEKSGLGAKIPVTIITGYLGAGKTTLLNYl 

LTEQHSKRVAVILNEFGEGSALEKSLAVSQG 

GELYEEWLELRNGCLCCSVKDNGLRAIENLM 

QKKGKFDYILLETTGLADPGAVASMFWVDA 

ELGSDIYLDGIITIVDSK.YGLKHLAEEKPDGLI 

NEATRQVALADAILINKTDLVPEEDVKKLRT 

TIRSINGLGQILETQRSRVDLSNVLDLHAFDSL 

SGISLQKXLQHVPGTQPHLDQSIVTITFDVPG 

NAKEEHLNMFIQNL.LWEKNVRNKDNHCMEV 

IRLKGLVSIKDKSQQVIVQGVHELYDLEETPV 

SWKDDTERTNRLVLLGRNLDKDILKQLFIAT 

VTETEKQWTTHFKEDQVCT 


649 


1999 


A 


4873 


226 


189 


DG VSLLLPKLG VQ W AQY W AHWQ PPLPGFKR 
FSCLSLRSSWD*KCAPPHPAFVFLVEMGFHRV 
GQAGLELRTSGDPPAS A SQS AGITG VS FTI A * P 
TSMPLLPFQRIXVYI 


650 


2000 


A 


4874 


2 


437 


FFFLRRSFAFVAQAGVQWCDLGSPQPLPPGF 
K*FSCLSLPSSWDYRHAPPPCPS*FLYF**RQG 
FTMLARLVLNS+PHDLPTSPSQSAEIKGVSHR 
CPASFYLFLKYYLEAKFCA*GECAPSAGVGA 
GYKRGHKSCLL INC WQI 


651 


2001 


A 


4898 


1701 


771 


DA WGPETRL ARILNPD SFIEPRPGRLPELEATR 

PHMEPKASCPAAAPLMERKFHVLVGVTGSV 

AALKLPLLVSKLLDIPGLEVAVVTTERAKHFY 

SPQDIPVTLYSDADEWEMWKSRSDFVLHIDL 

RRWADLLLVAPLDANTLGKVASGICDNLLTC 

VMRA WDRSKPLLFCPAMKTAM WEHPITAQQ 

VDQLKAFGYVEIPCVAKKLVCGDEGLGAMA 

EVGTIVDKVKEVLFQHSGFQQS*PGISVMGVP 

LYSEWVQAKSVKMDVGKIGGYPHLLNGGPA 

LSLPRGQACSRLNWTEGPGLSFFQPGEAAA 


652 


2002 


A 


4927 


1 


611 


FRGRQTSRPARGFSPWRPPGTMQEPSSGECPA 

SP*LPCASNRLAFGGLIFPCAPLVPYPAPFSPLL 

PAFSCAPRPRAHTHSRTHPSAPLVPKPSSRAR 

G QSPIPSRASSPSC S WAQVPG V AL ARCAG VC 

KPGDSWRVAACISQRCCSRGRRRGSGPRNPE 

QSFRGAWGPSFWGSWKSQRELSAGGAQAWP 

LLGSAGSGLRGEA 


653 


2003 


A 


4965 


2 


283 


FFFFI*DGVSLCHPGWNAVARSWLtATSASR 
VQAVSCFRLPS S WD YRHATMPG* FF* YF** R 
WGFTrLAILVLNS*PQVICPPWPPKVLTLQA 


654 


2004 


A 


4968 


3 


437 


RPGIPGRRFRRS WFCQLP* EPEPGLESL ATPGD 

IPAVGLGALGVIPPVRVPQRPPTQRSQGRGW 

DPERDPGCRVQVSRGPRFGEQKTPGLQGCLP 

PPCLTHLAAASCVVVWCGRWKRDSAECQCD 

HSCSAVSQQEDRCRSSSCS 


655 


2005 


A 


4983 


201 


397 


MNNNTTCIQPSM1S SMALPIFYILLCIVGVFGN 
TLSQWIFLTKIGKKTSTHIYLSHLVTANLLVC 


656 


2006 


A 


4988 


332 


159 


LVHKDMYREFFEEEAQASNKHVTRCLTSLVI 
RE VHIKTMR* HFIJ'IRLEKNKNNIKD 


657 


2007 


B 


5008 


129 


465 


MAGMKTASGDYIDSSWELRVFVGEEDPEAES 
VTLRVTGESHIGGVLLKIVEQINRKQDWSDH 

AIWWEQKRQWLLQTHWTLDKYGILADARLF 

FGPQHRPVILRLPNRRALRLX* 


658 


2008 


A 


5017 


1 


292 


FFFFKETESHSVTQAGVQWHDLGSLQPPPPGF 
KRFSCLSLLSSWDYRCAPPHPANFVFLVETGF 
HHVAQAGLKLLTL*SANLGLSTSLPIPLFILLS 


659 


2009 


A 


5018 


17 


338 


RGHGGKSLTGGTPGNWGDGLLVSEDWSHLIF 
T*NSLVSPVLGKWSPCLQGPGLSAVHTWPWL 
MAAC W A VH VKTHMRPGLA VLPRL VLNS WS 
* A1ILL WPP KAL G LQ A 


660 


2010 


A 


5028 


2 


310 


SRVDDFVGERRGGCDECLCGHRGLRAVPLG 
HPGHLCLQPPGGPA*FLDYCRGCCPHPVPGST 
AGSCPRQKKTTPGPT VLCV CSFWIYQRGEPH 
HRTGARWNH 


661 


2011 


A 


5050 


752 


431 


RQSCSSTQAKVQWFHYGPLQSQPPGLKQSSQ 
LSLFNSRJDHRHVPPRLAIFSFAETGSPYFAQAS 
LELLG S SHPPTS ASQS AR1TG V SHRAWPLK* F 
NLNQYQTLTMN 


662 


2012 


A 


5054 


48 


103 


ELNNGPFQMPLCNGGNLAVTGSWADRSPLH 

EAASQGRLLALRTLLSQGYNVNAVTLDHVTP 

LHEACLGDHVACARTLLEAGANVNATriDGV 

TPLFNACSQGSPSCAELLLEYGAQAQLESCLP 

SPTHEGASKGHHECLDILISWG1DVDQEIPHSG 

TPLYVACMAQQFHCIWNLIYAGAGVRKGKY 

WDTPLPGAGHQSTQKLE * LFAMVE1WQ 


663 


2013 


A 


5066 


951 


580 


VRNS * SF AHC AS VYKHH YMDGQTPCLFVSSIC 
ADLPEGVAVSGPSPAEFCRKHRLPAPVPFSCA 
GPAEPSTTIFTQLATMAAFPHLVHAELHPSSF 
WLRGLLGVVGAAVAAVLSFSLYRVLVKSQ 


664 


2014 


A 


5071 


550 


1 


LSFTEVLSMEQVNKTVVREFVVLGFSSLARLQ 
QLLFVIFLLLYLFTLGTNAIIISTIVLDRALHTP 
MYFFLAIL SCSEICYTFVI VPKMLVDLLSQKK 
TISFLGCAIQMFSFLFFGSSHSFLLAAMGYDR 
YMA1CNPLRYSVLMGHGVCMGLMAAAWAC 
GFTVSL VTTSL VFI TLPFHSSNQHE 


665 


2015 


A 


5074 


496 


692 


QQYHNTGSAGHHAHCQVGHSPHVHYPSGCG 

PL*IQRGLPSFNSLEGHSLKDSGHEESVQLDSE 

HDVQRSLYCDTAVNDVLNTSVTSMGSQMPD 

HDQNEGFHCREECRn .GHSDRCWMPRNPMPI 

RSKSPEHVRNIIALSIEATAADVEAYDDCGPT 

KRTFATFGKDVSDHPAEERPTLKGKRTVDVT 

ICSPKVNSVIREAGNGCEAISPVTSPLHLKSSL 

PTKPSVSYEIVDPGITARRC 


666 


2016 


A 


5080 


408 


248 


IMLLSTSS*VYFQSSTKDSHFFLFDFQKTGPPL 
VGPKAQLSGLQLQPCLYKRR 


667 


2017 


A 


5081 


129 


247 


DLTNSHFFLFDFQKTGPPLGGPKAQFSSLQLQ 
PCVY*RR 


668 


2018 


A 


5086 


852 


233 


NIKSNDRWVQIKTAVKYFF*KNGDNYNWVF 

RALPTTFADIENLKYLLFTRDASQPFYLGHTV 

IFGDLJEYVTVEGGrVLSRELMKRLNRLLDNSE 

TCADQSVIWKLSEDKQLAICLKYAGVHAENA 

EDYEGRDVFNTKPIAQ1.IREALSNNPQQVVEG 

CCSDMA1TFNGLTPQICMEVMMYGLYRLRAF 

GHYFNDTLVFLPPVGSEND 


669 


2019 


A 


5101 


1 


329 


PGRPTRPPLLTLLAHVSPEPAG PSCDSL AQPG 
ASGV*VQHDSHPPLLCGSQCLSEPVPGSHGPP 
RGCQHE AAPCPRGPG SDGLHH A S A A C AS LPP 
SPILPVLLPELGPL 


670 


2020 


A 


5102 


3 


547 


DAWGNRCAVGAAPRLIHLHLCCTPADPSRKP 
DEL*NMNGRVDYLVTEEEINLTRGPSGLGFNI 

VGGTDQQYVSNDSGIYVSRIKENGAAALDGR 

LQEGDKILSVNGQDLKNLLHQDAVDLFRNA 

GYAVSLRVQHRLQVQNGPIGHRGEGDPSGIPI 

FMVLVPVFALTMVAAWAFMRYRQQL 


671 


2021 


A 


5105 


672 


400 


RDGREELCLQQEPTLPSRICSSAPLLYFLFICPF 
VLLLLLLISLIXLYWKARKLSTLRSNTRKEKA 
LWVDLKEAGGVTTKRMED*EEDECN 


672 


2022 


A 


5148 


72 


314 


1IYFSYNIFLKITELLNDVERLKQALNGLSQLT 
YTSGNPTKRQSQLIDTLQHQVKSLEQQLAVS 
NQAHGALQEYVLAPCS 


673 


2023 


A 


5152 


210 


335 


REILCSRJGRLNIV+MSLFPNLTCRLNAIPIKIPA 
NHFVEVT 


674 


2024 


A 


5153 


3 


2953 


LTEDQPFDILQKSLQEANITEQTLAEEAYLDA 

SIGS SQQF AQ AQLHPS SSASFTQA SNVSNYSG 

QTLQPIGVTHVPVGASFASNTVGVQHGFMQH 

VGISVPSQHLSNSSQISGSGQIQLIGSFGNHPS 

MMTINNLDGSQIILKGSGQQAPSNVSGGLLV 

HRQTPNGNSLFGNSSSSPVAQPVTVPFNSTNF 

QTSLPVHNmQRGLAPNSNKVPINIQPKPIQM 

GQQNTYNVNNLGIQQHHVQQGISFASASSPQ 

GS WGPHMS VN1 VNQQNTRKPVTS Q A V SSTG 

GSrVTHSPMGQPFLAPQSQFLIPTSLSVSSNSVH 

HVQTINGQLLQTQPSQLISGQVASEHVMLNR 

N S SNMLRTNQP YTGPMLNNQNT A VFLL V S G Q 

TFAASGSPVIANHASPQLVGGQMPLQQASPT 

VLHLSPGQSSVSQGRPGFATMPSVTSMSGPSR 

FP A VS S AST AHPSLGS AVQS GSSGSNFTGDQL 

TQPNRTPVPVSVSHRLPVSSSKSTSTFSNTPGT 

GTQQQFFCQAQKKCLNQTSPISAPKTTDGLR 

QAQIPGLLSTTLPGQDSGSKVISASLGTAQPQ 

QEKWGSSPGHPAVQVESHSGGQKRPAAKQ 

LTKGAFTLQQLQRDQAHTVTPDKSHFRSLSD 

AVQRLLSYHVCQQSMPTEEDLRKVDNEFETV 

ATQLLKRTQAMLNKYRCLLLEDAMRINPPAE 

M VMIDRMFNQEERASL SRDKRLAL VDPEGFQ 

ADFCCSFKLDKAAHETQFGRSDQHGSKASSS 

LQPPAKAQGRDRAKTGVTEPMNHDQFHLVP 

NHIWSAEGNISKKTECLGRALKFDKVGLVQ 

YQSTSEEKASRREPLKASQCSPGPEGHRKTSS 

RSDHGTESKLSSELADSHLEMTCNNSFQDKSL 

RNSPKNEVLHTDIMKGSGEPQPDLQLTKSLET 

TFKNILELKKAGRQPQSDPTVSGSVELDFPNF 

SPMASQENCLEKFIPDHSEGVVETDSILEAAV 

NSILEC 


675 


2025 


A 


5154 


599 


1880 


LKKMEPFSCDTFVALPPATVDNRIIFGKNSDR 

LYDEVQEVVYFPAWHDNLGERLKCTYTEID 

QVPETYAWLSRPAWLWGAEMGANEHGVCI 

GNEAVWGREEVCDEEALLGMDLVRLGLERA 

DTAEKALNVTVDLLEKYGQGGNCTEGRMVF 

S YHN SFLIADRNEA WILETAGKYW AAEKVQE 

GVRMSNQLSriTKIAREHPDMRNYAKRXGW 

WDGKKEFDFAAAYSYLDTAKMMTSSGRYCE 

GYKLLNKHKGNITFETMMEILRDKPSGINME 

GEFLTTASMVFTLPQDSSLPCrHFFTGTPDPER 

SVFKPFIFWHISQIXDTSSPTFELEDLVKKKS 

HFKPDRRHPLYQKHQQALEVVNNNEEKAKJ 

MIJ)NMRKLEKELFREMESrLQNKHLDVEKJV 

NLFPQCTKDEIQIYQSNLSVKVSS 
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676 


2026 


A 


5155 


2 


306 


FFFLRRSLALSPRPDCGLQWRNLGSLQAPPPG 
FTPFSCLSLPSSWDYRRPPPRPANFLYF**RRG 
FTLLARMVSIS*PHDPPASASQSAGITGVSHRA 
RPT 


677 


2027 


A 


5167 


97 


740 


FFHSVDLLALEQSKTFYKPDWFDIVESEVKCC 

KEAVCV1DMSSFTEFEITSTGDQALEVLQYLF 

SNDLDVPVGH1VHTGMLNEGGGYENDCSIAR 

LNKRSFFMISPTDQQVHCW'AWLKKIIMPKDS 

NLLLEDVTWKYTALNLIGPRAVDVLSELSYA 

PMTPDHFPSLFCKEMSVGYANGIRVMSMTHT 

GEPGFMLYIPIEYRWGFTMLSTLVSNS 


678 


2028 


A 


5183 


1919 


2018 


P ALCRLRDDMT VC V ADFGLSKKI YSGD YYRQ 
GRIAKMPVKWIAIESLADRVYTSKSDVWAFG 
VTMWEIATRGMTPYPGVQNHEMYDYLLHG 
HRLKQPEDCLDELCKI**SPQSP 


679 


2029 


A 


5190 


39 


499 


RESQVKHFKMRKIDLCLSSEGSEVILATSSDE 
KlffPF^IIDGNPETFWTTTGMFPQEFIICFHKH 
VRIERL V I QS YFVQTLKIEKSTSKEP VDFEQWI 
EKDLVHTEGQLQNEEIVAHDGSATYLRFIIVS 
AFDHFASVHSVSAEGTWSNLSS 


680 


2030 


A 


5204 


541 


92 


FII AVI KLACGDISLNALALMVATAVLTLAPL 
LUCLSYLFILSAJLRVPSAAGRCKAFSTCSAH 
RTVVWFYGTISFMYFKPKAKDPNVDKTVAL 
FYG VVTPSLNPIIYSLRN AEVKAA VLTLLRG G 
LLSRKASHCYCCPLPLSAGIG 


681 


2031 


A 


5207 


10 


247 


VPDNGDVTKLPVCSTLVEETSLTVSEAMEQSI 
KNESPLPGTLAHTCNTSTLGGRGRWIT*GREF 
DTSM AN M VKPCLYRK 


682 


2032 


A 


5210 


2 


231 


FFFETESYSITQAGVQWPNLSSLKTLPPGFK*F 

SCLSLPSSWDYRCLPPCPANFCIFSRNGVLPC 

WPGWSRTPDLS 


683 


2033 


A 


5218 


85 


402 


CPS V SGLIKSDLRRHNINIGITNVDVKAV SNIP 
M I1LLRSMYRINVKP YFFI*LFFSRVNC* S VUG 
YARCYTFL1F+LFL* IPADSPTDQEPKTVMLSK 
QSESA1 


684 


2034 


A 


5220 


1 


194 


NLMKEMQNLNSENHKTWEEYKDTK* IMS YF 
YG*ALr>TVTKMAVLPKLMYRFSATLVKlPQHL 
TDS 


685 


2035 


A 


5228 


260 


440 


LHSQDGNSDPRKPQGEMSAHAFPVQTCGEED 
QKKTPQVPINFTELSKCS * S* KIMSGERE 


686 


2036 


A 


5239 


79 


508 


GGEAAARAAKLSSPRPHRVGRRERGVGGMS 
AFSEAALEICiCLSELSNSQQSVQTLSLWLIHHR 
KJiSRPIVTV^VERELRJCAKPNRKLTFLYLAND 
VIQNSKRKGPEFTKDFAPVIVEAFKHVSSETD 
ESCKKHLGRVLSIWEERS 


687 


2037 


A 


5244 


1 


428 


MAAWAATALKGRGARNARVLRGILAGATA 
NKASHNRTRALQSHSSPEGKEEPEPLSPELEYI 
PRKRGKNPMKAVGLAWAJGFPCGILLFILTKR 
EVDKDRVKQMKARQNMRLSNTGEYESQRFR 
ASSQSAPSPDVGSGVQT 


688 


2038 


A 


5249 


1 


1407 


LQQTEDKSLLNQGSSSEEVAGSSQKMGQPGP 

SGDSDLATALHRLSLRRQNYLSEKQFFAEEW 

QRKJQVLADQKEGVSGCVTPTESLASLCTTQS 

ErmLSSASCLRGFMPEKLQIVKPLEGSOTLY 

HWQQLAQPNLGTILDPRPGVITKGFTQLPGD 

AIYHISDLEEDEEEGrrFQVQQPLEVEEKLSTS 

KPVTGIFLPPrrSAGGPVTVATANPGKCLSCT 

NSTETFTTCRTLHPSDITQ VTPSSGFPSLSCG S S 

GS S SSNTA VNSP ALA YRLSIGESITN RRD STTT 
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FSSTMSLAKLLQERGISAKVYHSPISENPLQPL 

PKSLAIPSTPPNSPSHSPCPSPLPFEPRVHLSEN 

FLASRPAETFLQEfcfYGLRPSRNPPDVGQLKM 

NLVDRLKRLGIARVVKNPGAQENGRCQEAE] 

GPQKPDSAVYLNSGSSLLGGLRRNQSLPVIM 

GSFAAPVCTSSPKMGVLKED 


689 


2039 


A 


5254 


2 


2621 


LSLFGSRALGRSGARAMAKAKKVGARRKAS 

GAPAGARGGPAKANSNPFEVKVNRQKFQILG 

RK.TRHDVGLPGVSRARALRKRTQTLLKEYKE 

RDKSh^TOKJ^GEYNSNMSPEEKMMKRFA 

LEQQRHHEKKSIYNLNEDEELTHYGQSLADIE 

KHNDIVDSDSDAEDRGTLSGELTAAHFGGGG 

GLLHKKTQQEGEEREKPKSRKELIEELIAKSK 

QEKRERQAQREDALELTEKLDQDWKEIQTXL 

SHKTPKSENRDKKEKPKPDAYDMMVRELGF 

EMKAQPSNRMKTEAELAKEEQEHLRKLEAE 

RLRRMLGKDEDENVKKPKHMSADDLNDGFV 

LDKDDRRLLSYKDGKMNVEEDVQEEQSKEA 

SDPESNEEEGDSSGGEDTEESDSPDSHLDLES 

NVESEEENEKPAKEQRQTPGKGLISGKERAG 

KATRDELPYTFAAPESYEELRSLLLGRSMEEQ 

LLWERIQKCNHPSLAEGNKAKLEKLFGFLLE 

YVGDLATX>DPPDLTVfDKLVVHLYHLCQMFP 

ESASDAIKFVT,ROA MHFMFPNjfTPTTrrtj a at t> 

GLDVLIYLKITGLLFPTSDFWHPVVTPALVCL 
S QLLTKCP1LSLQD WKGLF VCCLFLE YVALS 
QRFIPELINFLLG1LYI ATPNKASQGSTL VI IPFR 
AEGKNSELLVVSAREDVATWOOSSI sirwa 
SRLRAPTSTEANHIRLSCLAVGLALLKRCVLM 
YGSLPSFHAIMGPLRALLTDHLADCSHPQELQ 
ELCQ STLTEMESQKQLCRPLTCEKSKPVPLKL 
FTPRLVKVLEFGRKQGSSKEEQERKRLIHKHK 
REFKGAVREIRKDNQFXARMQLSEIMERDAE 

rkrkvkql™slatqegewkalkrkkfkk 


690 


2040 


A 


5261 


1 


304 


FFFFVFLVETGFHHVGQAGLELLTSGDPPTW 
ASQSAGJTGVSHCSWPVIYVLSTLLHAVRKVL 
FKRTFPLKSSSFLSYDKE1FP1UVLKFYLVTLT 
SFVK 


691 


2041 


A 


5270 


3 


158 


NCHTTHCTAN WVlILrcTPPGWKIDGPAAAL 
EVLSSFFFFFLKFSYKPQNTV 


692 


2042 


A 


5282 


56 


1268 


GME P VGCCG ECRGSS VDPRSTFVLSNL AE W 
ERVI.TFLPAKALLRVACVCRLWRECVRRVLR 
THRS VT WIS AG LAE AG HLEG HCL VR W AEFJ 
ENVRILPHTVLYMADSETFISLEECRGHKRAR 
KJITSMETALALEKLFPKQCQVLGIVTPGIVVT 
PMGSGSNRPQEIEIGESGFALLFPQIEGIKIQPF 
HFIKDPKNLn.. F.RHQLTE VGLLDNPELRWL V 
FGYNCCK VGA SN YL QQ WSTFSDMNHLAGG 
QVDNLSSLTSEKNPLDIDASGWGLSFSGHRI 
ybAl VLLNtDVbDEKTAEAAMQRLKAANIPE 
I rNTIGFMFACVGRGFQYYRAKGNVEADAFR 
KFFPS VPLFGFFGNGEI GCDRJVTGNFILRKCN 
EYTCDDDLFHSYTTIMALIHLGSSK 


693 


2043 


A 


5301 


362 


507 


EEIKERFGPGLVIYWYGFIQELDCNRERGILLK 
ACFPTNrVTLCHSIA 


694 


2044 


A 


5310 


1 


204 


RYLTAJNHTLKENLRKFYKGKKDKPLDLRPK 
iCTRAMRJIRLNMFiEENLKTKKQHRKERLYPL 
UCYAAKA 


695 


2045 





5315 


125 


1596 


araSTAVKSEVQVCJSLLLCLEDRTMPKKAKP 
TG SGKEEGP APCKQMKLEAAGGPS ALNFD SP 

SSLFESLISPIKTETFFKEFWEQKPLLIQRDDPA 

LATYYGSLFKLTDLKSLCSRGMYYGRDVNV 

CRCVNGKKKVLNKDGKAHFLQLRKDFDQKR 

ATIQFHQPQRFKDELWRIQEKLECYFGSLVGS 

NVYITPAGSQGLPPHYDDVEVFILQLEGEKH 

WRLYHPTVPLAREYSVEAEER1GRPVHEFML 

KPGDLL YFPRGTIHQ ADTPAGL AH STHVTI ST 

YQNNSWGDFLLDTISGLVFDTAKEDVELRTG 

IPRQLLLQVESTTVATRRLSGFLRTLADRLEG 

TKELLSSDMKKDF1MHRLPPYSAGDGAELSTP 

GGKLPRLDSVVRLQFKDHIVLTVLPDQDQSD 

ETQEKMVYIYHSLKNSRETHMMGNEEETEFH 

GLRFPLSHLDALKQIWNSPAISVKDLKLTTDE 

EKESL VL SL WTECLIQ W 


696 


2046 


A 


5318 


1476 


742 


LMKXYLEAAELGEISDIHTKLLRLSSSQGT1ET 

SLQD1DSRLSPGGSLADAWAHQEGTHPKDRN 

VEKLQVLLNCMTEIYYQFICKDKAERRLAYN 

EEQrHKFDKQKLYYHATKAMTHFTDECVKK 

YEAFLNKSEEWIRKMLHLRKQLLSLTNQCFDI 

EEEVSKYQEYTNELQETLPQKMFTASSGIKHT 

MTPIYPSSNTLVEMTLGMKKLKEEMEGVVKE 

LAENNHILESGGSLTMDGGLRNVDCL 


697 


2047 


A 


5320 


244 


478 


LDYNFFLFEMTFGLVSQAGVQWHDLGSLQPP 
PPGFKQFSCLSLPSSWDYRHLPPHLANFSREG 
VSPSWPGWSRTPDFR 


698 


2048 


A 


5324 


266 


714 


LPIRKSLRS VRS GFPTSQSPITRN LD GT AS G S C 
LAKTVTGSLFRINVGLRGLVAGGIIGAI .1 .GTP 
VGGLLMAFQKY SGETVQERKQKDRKALHEL 
KLEEWKGRLQVTEHLPEKIESSLQEDEPENDA 
KKIEAIXNLPRNPSVIDKQDKD 


699 


2049 


A 


5334 


699 


277 


RPHGHLVCISSSAGLSGVNGLADYCASKFAA 

FGFAESVFVETFVQKQKGIKTTIVCPFFIKTGM 

FEGCTTGCPSLLPILEPKYAVEKIVEAILQEKM 

YLYMPKLLYFMMFLKSFLPLKTGLLIADYLGI 

LHAMDGFADQKK 


700 


2050 


A 


5344 


3 


614 


FTAEEMSSLTPESSPELAKRSWFGNFISLDKEE 

QIFLYLKDKPLS SIKADI VHAFL SIPSLSHS VLS 

QTSFRAEYKASGGPSVFQKPVRFQVDISSSEG 

PEPSPRRDGSGGGGIYSVTFTLISGPSRRFKRV 

VETIQAQLLSTHDQPSVQALADEKNGAQTRP 

AGAPPRSLQPPPGRPDPELSSSPRRGPPKDKK 

LLATNGTPL 


701 


2051 


A 


5346 


3 


1383 


HASVLFCRVMAASKTQGAVARMQEDRDGSC 

STVGGVGYGDSKDCILEPLSLPESPGGTTTLE 

GSPSVPCIFCEEHFPVAEQDKLLKHMIIEHKIV 

IADVKLVADFQRYILYWRKRFTEQPITDFCSV 

nUNSTAPFEEQENYFLLCDVLPEDRILREELQ 

KQRLREILEQQQQEKNDTNFHGVCMFCNEEF 

LGNRSVILNHMAREHAFNIGLPDNIVNCNEFL 

CTLQKXLDNLQCLYCEKTFRJDKNTL1CDHMR 

KKQHRKINPKNREYDRFYVINYLELGKSWEE 

VQLEDDRELLDHQEDDWSDWEEHPASAVCL 

FCEKQAETIEKLYVHMEDAHEFDLLKIKSELG 

LNFYQQVKLVNFIRRQVHQCRCYGCHVKFKS 

KADLRTHMEETKHTSLLPDRKTWDQLEYYFP 

TYENDTLLWTLSDSESDLTAQEQNENVPIISE 

DTSKLYALKQSSILNQLLL 


702 


2052 


A 


5356 


2502 


1540 


MAAATRGCRPWGSLLGLLGLVSAAAAAWD
LASLRCTLGAFCECDFRPDLPGLECDLAQHL
AGQHLAKALWKALKAFVRDPAPTKPLVLSL 
HGWTGTGKSYVSSLLAHYLFQGGLRSPRVH 
HFSPVLHFPHPSHIERYKKDLKSWVQGNLTA 
CGRSLFLFDEMDKMPPGLMEVLRPFLGSSWV 
VYGTN YRKAIFIFI SNTGGEQrNQVALE A WRS 
RRDREEILLQELEPVISRA VLDNPI IHGFSNSGI 
MEE RLLD A V VPFLPL Q RHHVRH C VLNEL A Q L 
GLEPRDE WQ AVLDSTTFFPEDEQLF S SNGCK 
TVASRLAFFL 


703 


2053 


A 


5380 


278 


657 


LFLQKLRMKTEEE ART! riFIEMFLRKEQQKL 
EERLEFWMEKYDKDTEMKQNELNALKATKA 
SDL AHL QDLAKMIRE YEQ VIIEDRIEKERSKX 
KVKQDLLELKSVIKLQAWWRGTMrRREIGGF 
KM 


704 


2054


A


5381 


1


1003 


FRGRAVKMAAVVEVEVGGGAAGERELDEV 

DMSDLSPEEQWRVEHARMHAKHRGHEAMH 

AEMVLILIATLWAQLLLVQWKQRHPRSYN 

MVTLFQMWVVPLYFTVKLHWWRFLVIWILF 

SAVTAFVTFRATRKPLVQTTPRLVYKWFLLIY 

K1SYATGIVGYMAVMFTLFGLNLLFK1KPEDA 

MDFGI SLLFYGLYYGVLERDF AEMC AD YMA 

STIGFYSESGMPTKHLSDSVCAVCGQQIFVDV 

SEEGnF>JTYRI,SCNHVFHEFCIRGWCIVGKK 

QTCPYCKEKVDLKRMFSNPWERPHVMYGQL 

LDWLRYLVAWQPVIIGWQGINYILGLE 


705 


2055 


A 


5396 


3 


675 


IYDRDPLQLATRAGQPLDINMAGEPKPYRPKP 

GNKRPLSALYRLESKEPFLSVGGYVFDYDYY 

RDDFYNRLFDYHGRVPPPPRAVIPEKRPRVA 

VTTrRRGKGVFSMKGCSRSTASGSTGSKLKS 

DELQTIKKELTQIKTKID SVLGRLDKIEKQQK 

AEAEAQKKLLEESLVLIQEECVSEIADHSTEEP 

AEGGPDADGEEMTDGIEEAFDEDGGHELFLQ 

IK 


706 


2056 


A 


5410 


2 


98 


GRVGLNLEGRGCSEPKWRHCTPTWATEQDSI 
S 


707 


2057 


A 


5415 


6 


287 


PFKLTPSFLSHAFSSGQERKVFIELNHIKKCNT 
VRGVFVLEEFGNYTILLLGLDSHGSNSNLGAP 
EEGLGAGRKRTSVFJCSGGAGVTRKXRDP 


708 


2058 


A 


5423 


3 


291 


SSSNPLGSPSTLWKLCSFVLHNKSCCCSFFGS 

TPTUIAITLTVRVCGF3PEVSKTTNPLGRTNNS 

GCTEFKTVTLTARSTASLLKSVRPRTHQKE 


709 


2059 


A 


5424 


679 


347 


RIRHEEKRGSRGRGRRTSEEDTPKKKKHKGG 
SEFTDTILSVHPSDVLDMPVDPNEPTYCLCHQ 
VSYGEMIGCDNPDCPIEWFHFACVDLTTKPK 
GKWFCPRCVQEKRKKK 


710 


2060 


A 


5442 


1073 


559 


QESLKKKIQPKLSLTL SSS VSRGNV STPPRHS S 

GSLTPPVTPPITPSSSFRSSTPTGSEYDEEEVDY 

EESDSDESWTTESAISSEAILSSMCMNGGEEK 

PFACPVPGCKKRYKNVNGIKYHAKNGHRTQI 

RVRKPFKCRCGKSYKTAQGLRHHTINFHPPV 

SAE1IRKMQQ 


711 


2061 


A 


5449 


1 


319 


GDSLCWQYNKYREERVILFLKMASGHAFQP 
DLVKRIRDAIRMGLSARHVPSLILETKGIPYTL 
NGKKVEVAVKQIIAGKAVEQGGAFSNPETLD 
LYRDIPELQGF 


712 


2062 


A 


5499 


91 


749


RPTPGHGDFWMQPLTKDAGMSLSSVTLASAL 
QVRGEALSEEEI WSLLFLAAEQLL EDLRNDS S 
DYWCPWSALLSAAGSLSFQGRVSHIEAAPF
KAPELLQGQSEDEQPDASQMHVYSLGMTLY 
WSAGFH VPPHQPLQLCEPLHSIL LTMCEDQPH 
RRCTLQSVLEACRVHEKEVSVYPAPAGLHIR 
RLVGLVLGTISEVSREPCFSSSSCWSCVAIKI 


713 


2063 


A 


5506 


22 


478 


VBELEL V SRLDPHLHTPMYFFLAHLSFLDLSFT 

TSS1PQLLYNLNGCDKTISYMGCAIQLFLFLGL 

GGVECLLLAVMAYDRCVAICKPLHYMVTMN 

PRLCRGLVSVTWGCGVANSLAMSPVTLRLFR 

CGHHEVDHFLCEMPALIRMACISTV 


714 


2064 


A 


5514 


25 


220 


AIRPYWCENNI1GIGKXSTADGKAFADPEVLR 
RLTSS VSCALDE AAAALTRMRAESTAN AGQ S 
DK 


715 


2065 


A 


5526 


3 


810 


KVTAPRRPQRYSSGHGSDNSSVLSGELPPAM 

GRTALFHHSGGSSGYESLRRDSEATGSASSAP 

D SM SESG AASPG ARTRSLKSPfCKRATGLQRR 

RLrPAPLPDTTALGRKPSLPGQWVDLPPPLAG 

SLKEPFEIKVYEIDDVERLQRPRPTPREAPTQG 

LACVSTRLRLAERRQQRLREVQAKHKHLCEE 

LAETQGRLMLEPGRWLEQFEVDPELEPESAE 

YLAALERATAALEQCVNLCKAHVMMVTCFD 

ISVAASAAIPGPQEVDV 


716 


2066 


A 


5529 


458 


790 


SPGYGENKFTVTSXNIAVPLCEMNKJYSYYSD 
SSSSERTMDLVLEMCNTNSIHWCGISGRQLG 
KLHPSSSLCLALTLLSSVQGLQSISGLRLTDTF 
LKRTYEYDDIAQVCV 


717 


2067 


A 


5531 


3 


460 


NSEDLLK.YFNPESWQEDLDNMYLDTPRYRG 
RSYHDRKSKVDLDRLNDDAKRYSCTPRNYS 
VNIREELKLANVVFFPRCLLVQRCGGNCGCG 
TVNWRSCTCN SGKT VKKYHE VLQFEPGHIKR 
RGRAKTMALVDIQLDHHERCDCICSSRPPR 


718 


2068 


A 


5586 


311 


88 


AVLKNMAPMTALGLLDLrflLNLR.FLSAGEDF 

TSWSEIMMYTLLVFLTLWLLIEMIYCYRKVS 

KAEEAAQENA 


719 


2069 


A 


5598 


1 


330 


KN C ANE A WQ KIL DRVLSR YD VRLRPNF GSM 
LATNSTRGLNEDELMAHGQEKDSSSESEDSC 
PPSPGCSFTEGFSFDLLNPDYVPKVDKWSRFL 
FPLAFGLFNIVAAERC 


720 


2070 


A 


5628 


798 


148 


LPPAQrPEAWLLLANV\WLILVPLKDRLIDP 

LLLRCKLLPSALQKMALGMFFGFTSVIVAGV 

LEMERLHYIHHNETVSQQIGEVLYNAAPLSIW 

WQIPQYLLIGISEIFASIPGLEFAYSEAPRSMQG 

AIMGIFFCLSGVGSLLGSSLVALLSLPGGWLH 

CPKDFGNINNCRMDLYFFLLAGIQAVTALLF 

VWIAGRYERASQGPASHSRFSRDRG 


721 


2071 


A 


5632 


146 


536 


MSALIVRXLR5AELTLFSELPTVLGANVNAA 
rCLHETALHHAAKVKKV'DLIEMLIEFGGNrYA 
RDNRGKKPSD YTWS S S APAKCFEYYEKTPLT 
LSQLCRVNLRJCATGVRGLEKIAKLN1PPRLID 
YLSYN 


722 


2072 


A 


5638 


3 


3806 


CPSLDIRSEVAELRQLENCSWEGHLQILLMF 

TATGEDFRGLSFPRLTQVTDYLLLFRVYGLES 

LRDLFPNLAVIRGTRLFLGYALVIFEMPHLRD 

VALPALG A VLRGA VR VEKNQELCHL STTDW 

GLLQPAPGANHTVGNKLGEECADVCPGVLGA 

AGEPCAKTTFSGHTDYRCWTSSHCQRVCPCP 

HGNtACTARGECCFTrECLGGCSQPEDPRACV 

ACRHLYFQGACLWACPPGTYQYESWRCVTA 

ERCASLHSVPGRASTFGrHQGSCLAQCPSGFT 

RNSSSIFCHKCEGLCPKECK VGTKUD SIQ AA
QDLVGCTHVEGSLILNLRQGYNLEPQLQHSL 

GLVETITGFLKIKHSFALVSLGFFKNLKLIRGD 

AMVDGNYTLYVLDNQNLQQLGSWVAAGLTI 

PVGKIYFAFNPRLCLEHIYRLEEVTGTRGRQN 

KAEINPRTNGDRAACQTRTLRFVSNVTEADRI 

LLRWERYEPLEARDLLSFIVYYKESPFQNATE 

HVGPDACGTQSWNLLDVELPLSRTQEPGVTL 

ASLKPWTQYAVFVRAITLTTEEDSPHQGAQS 

PIVYLRTLPAAPTVPQDVISTSNSSSHLLVRW 

KPPTQRNGNLTYYLVLWQRLAEDGDLYLND 

YCHRGLRLPTSNNDPRFDGEDGDPE AEME SD 

CCPCQHPPPGQVLPPLEAQEASFQKKFENFLH 

NAITIPISPWKVTSINKSPQRDSGRHRRAAGPL 

RLGGNSSDFEIQEDKVPRERAVLSGLRHFTEY 

RIDIHACNHAAHTVG CS AATFVF ARTMPHRE 

ADGIPGKVAWEASSKNSVLLRWLEPPDPNGL 

ILKYEDCYRRLGEEATVLCVSRLRYAKFGGV 

HLALLPPGNYSARVRATSLAGNGSWTDSVAF 

YILGPEEEDAGGLHVLLTATPVGLTLLIVLAA 

LGFFYGKKRNRTLYASVNPEYFSASDMYVPD 

E WE VPREOIS IIREL GOGSFGMV YEGLARGLE 

AGEESTPVALKTVNELASPRECIEFLKEASVM 

KAFKCHHVVRLLGVVSQGQPTLV1MELMTR 

GDLKSHLRS LRPE AENNPGLPQPALGEMIQM 

AGE1ADGMAYLAANKFVHRDLAARKCMVSQ 

DFTVKJGD FGMTRD VYETD YYRKGGKGLLP 

VRWMAPESLKDGIFTTHSDVWSFGVVLWEIV 

TLAEQPYQGLSNEQVLKFVMDGGVLEELEGC 

PLQLQELMSRCWQPNPRLRPSFTHILDSIQEEL 

RPSFRLLSFYYSPECRGARGSLPTTDAEPDSSP 

TPRDCSPQNGGPGH 


723 


2073 


A 


5672 


1 


216 


LAW LDN I LPEKEKKETD KKRKRKKG AHEDCD 

EEPQFPPPSVIKIPMESVQSDPQNGIHCIARKR 

SSSWSYSL 


724 


2074 


A 


5704 


4235 


940 


ARGRRSRFVWAASWGGRGRPAARRRPRGLA 

ATMGFELDRFDGDVDPDLKCALCHKVLEDP 

LTTPCGHVFCAGCVLPWVVQEGSCPARCRGR 

LSAKELNHVLPLFCRLILKLD1KCAYATRGCGR 

VVKLQQLPEHLERCDFAPARCRHAGCGQVLL 

RRDVEAHMRDACDARPVGRCQEGCGLPLTH 

GEQRAGGHCCARALRAHNGALQARLGALHK 

ALKXEALRAGKREKSLVAQLAAAQLELQMT 

ALRYQKKFTEYSARLDSLSRCVAAPPGGKGE 

ETKSLTLVLHRDSGSLGFNnGGRPSVDNHDG 

SSSEGIFVSKIVDSGPAAKEGGLQIHDRIIEVN 

GRDLSRATHDQAVEAFKTAKEPrVVQVLRRT 

PRTKMFTPPSESQLVDTGTQTDITFEHIMALT 

KMSSPSPPVLDPYLLPEEHPSAHEYYDPNDYI 

GDIHQEMDREELELEEVDLYRMNSQDKLGLT 

VCYRTDDEDDIGIYISEIDPNSIAAKDGRIREG 

DRIIQINGIEVQNREEAVALLTSEENKNFSLLI 

ARAELQLDEGWMDDDRNDFLDDLHMDMLE 

EQHHQAMQFTASVLQQKKHDEDGGTTDTAT 

ILSNQHEKDSGVGRTDESTRNDESSEQENNG 

DDATASSNPLAGQRKLTCSQDTLGSGDLPFS 

NESFISADCTDADYLGIPVDECERFRELLELK 

CQVKSATPYGLYYPSGPLDAGKSDPESVDKE 

LELLNEELRSIELECLSIVRAHKMQQLKEQYR 

ES WMLHNS GFRN YNTS ID VRRHELSDITELPE 

KSDKDSSSAYNTGESCRSTPLTLEISPDNSLRR
AAEG[SCPSSEGAVGTTEAYGPASKNLLSITE 

DPEVGTPTYSPSLKELDPNQPLESKERRASDG 

SRSPTPSQKLGSAYLPSYHHSPYKHAHIPAI IA 

QHYQSYMQLIQQKSAVEYAQSQMSLVSMCK 

DLSSPTPSEPRMEWKVKIRSDGTRYITKRPVR 

DRLLRERALKIREERSGMTTDDDAVSEMKM 

GRYWSKEEEKQHLVKAKEQRRRREFMMQSR 

LDCLKEQQAADDRKEMNILELSHKKMMKKR 

NKKIFDNWMT1QELLTHGTKSPDGTRVYNSF 

LSVTTV 


725 


2075 


A 


5707 


3 


1770 


QISTEVSEAPVANDKPKTLWKVQKKAADLP 

DRDTWKGRFDFLMSCVGYAIGLGNVWRFPY 

LCGKNGGGAFLIPYTLTLIFAGVPLFLLECSLG 

QYTSIGGLGVWICLAPMFKGVGLAAAVLSFW 

LNIYYIV1ISWAIYYLYNSFTTTLPWKQCDNP 

WNTDRCFSN Y SM VNTTNMTS A V VEF W ERN 

MHQMTDGLDKPGQIRWPLAITLAIAWILVYF 

CIWKGVGWTGKVVYFSATYPY1MLIILFFRGV 

TLPGAKEGILFYITPNFRKLSDSEVWLDAATQ 

IFF S YGLGLGSLLALG S YNSFHNNVYRDS II VC 

CINSCTSMFAGFVTFSIVGFMAHVTKRSIADV 

AASGPGLAFLAYPEAVTQLPISPLWAILFFSM 

LLMLGIDSQFCTVEGFrTALVDEYPRLLRNRR 

ELFIAA VCII S YLI GLSN1TQGGI Y VFKLFD YYS 

ASGMSLLFLVFFECVSISWFYGVNRFYDNIQE 

MVGSRPCIWWKLCWSFFTPIIVAGVFIFSAVQ 

MTPLTMGNYVFPKWGQGVGWLMALSSMVL 

IPG YMA YMFLTLKG S LKQRIQ VM VQPSEDIV 

RPENGPEQPQAGSSTSKEAYI 


726 


2076 


A 


5711 


156 


423 


PRRDPGRTPELRGSAPRKTGANMPVRRGHVA 

PQNTFLGTIIRKFEGQNICKFUANARVQNCAII 

YCNDGFCEMTGFSRPDVMQKPCTCD 


727 


2077 


A 


5716 


3 


274 


FIASE YFFKL C SFQVFL5FPL ATFVID V GL WIP 
LVKSPNVHYVYVLLLVLSGLLFYIPLIHFKIRL 
AWFEKMTCYLQLLFNICLPDVSEE 


728 


2078 


A 


5737 


1899 


649 


IQASRASPYPRVKVDFALSCHEDLLAPISEPIE 

WK YHSPEEEISLGPACWLWDFLRRS QQAGFL 

LPLSGGVDSAATACLIYSMCCQVCEAVRSGN 

EEVLADVRTrVNQISYTPQDPRDLCGRILTTC 

YMASKNSSQETCrrRARELAQQIGSHHISLNID 

PAVKAVMG1FSLVTGKSPLFAAHGGSSRENL 

ALQNVQARIRMVLAYLFAQLSLWSRGVHGG 

LLVLGS ANVDESLLG YLTK YDCS S ADINPIGG 

ISKTDLRAFVQFCIQRFQLPALQSILLAPATAE 

LEPIJVDGQVSQTDEEDMGMTYAELSVYGKL 

RKVAKMGPYSMFCKJLLGMWRH1CTPRQVAD 

KVKRFFSKYS^^^^RHKMTTLTPAYHAEKYSPE 

DNRFDLRPFLYNTSWPWQFRCIENQVLQLER 

AEPQSLDGVD 


729 


2079 


A 


5741 


1 


5976 


PGCAARLSRARAPGPGAAGAGRKRLADPGPP 

P ASRRLRAPG SRPRLAPCTRRAAQP AHARMA 

PRAAGGAPLSARAAAASPPPFQTPPRCPVPLL 

LLIXLGAARAGALEIQRRFPSPTPTNNFALDG 

AAGTVYLAAVNRLYQLSGANLSLEAEAAVG 

PVPDSPLCHAPQLPQASCEHPRRLTDNYNKIL 

QLDPGQGLVWCGSIYQGFCQLRRRGNISAV 

AVRFPPAAPPAEPVTVFPSMLNVAANHPNAS 

TVGLVLPPAAGAGGSRLLVGATYTGYGSSFE 

PRNRSLEDHRFENTPEIAIRSLDTRGDLAKLFT
FDLNPSDDNILKIKQGAKEQHKLGFVSAFLHP 

SDPPPGAQSYAYLALNSEARAGDKESQARSL 

LARICLPHGAGGDAKKLTESYIQLGLQCAGG 

AGRGDLYSRLVSVFPARERLFAVFERPQGSPA 

ARAAPAALCAFRFADVRAAIRAARTACFVEP 

APDWAVLDSWQGTGPACERKLNIQLQPEQ 

LDCGAAHLQHPLS1LQPLKATPVFRAPGLTSV 

AVASVNNYTAWLGTVNGRLLKINLNESMQ 

WSRRVVTVAYGEPVHHVMQFDPADSGYLY 

LMTSHQMARVKVAACNVHSTCGDCVGAAD 

AYCGWCALETRCTLQQDCTNSSQQHFWTSA 

SEGPSRCPAMTVLPSETDVRQEYPGMILQISGS 

LPSLSGMEMACDYGNNIRTVARVPGPAFGHQ 

IAYCNLLPRDQFPPFPPNQDHVTVEMSVRVN 

GRNrVKANFTTYDCSRTAQVYPHTACTSCLSA 

QWPCFWCSQQHSCVSNQSRCEASPNPTSPQD 

CPRTLLSPLAPVPTGGSQNILVPLANTAFFQG 

AALECSFGLEEIFEAVWVNESVVRCDQWLH 

TTRKSQVFPLSLQLKGRPARFLDSPEPMTVM 

VYNC AMGSPDC SQCLGREDLGHLCM WSDGC 

RLRGPLQPMAGTCPAPEIRAIEPLSGPLDGGT 

LLTIRGRNLGRRLSDVAHGVWIGGVACEPLP 

DRYTVS EEI VC VTGPAPGPLSG WTVN ASKE 

GKSRDRFSYVLPLVHSLEPTMGPKAGGTRITI 

HGNDLHVGSELQVLVNDTDPCTELMRTDTSI 

ACTMPEGALPAPVPVCVRFERRGCVHGNLTF 

WYMQNPVITAJSPRRSPVSGGRTITVAGERFH 

MVQNVSMAVHHIGREPTLCKVLNSTLITCPSP 

GALSNASAPVDFFINGRAYADEVAVAEELLD 

PEEAQRG SRFRLD YLPNPQFSTAKREKWIKH 

HPGEPLTLVIHVSTKGAGKEQDSLGLQSHEY 

RVKIGQVSCDIQIVSDRIIHCSVNESLGAAVGQ 

LPITIQVGNFNQTIATLQLGGSETAIIVSIVICSV 

LLLLSVVALFVFCTKSRRAERYWQKTLLQME 

EMESQIREEIRKGFAELQTDMTDLTKELNRSQ 

GIPFLEYKHFVTRTFFPKCSSLYEERYVLPSQT 

LNSQGS SQ AQETHPLLGE WKIPESCRPNMEE 

GISLFSSLLDNKHFLIVFVHALEQQKDFAVRD 

RCSLASLLTIALHGKLEYYTSIMKELLVDLID 

ASAAKNPKLMLRRTESVVEKML'TNTVMSrCM 

YSCLRETVGEPFFLLLCADCQQINXGSIDAITG 

KARYTLNEEWLUCENEAI^RNUvrVSFQGCG 

MDSLSVRAMDTDTLTQVKEKILEAFCKNVPY 

SQWPRAED VDLEWF AS STQSYILRDLDDTS V 

VEDGRKKLNTLAHYKIPEGASLAMSLIDKKD 

NTLGRVKDLDTEKYFHLVLPTDELAEPKKSH 

RQSHRKKVLPEIYLTRLLSTKGTLQKFLDDLF 

KAILS IREDKPPLAVKYFFDFLEEQAEKRGISD 

PDTLHIWKTNSLPLRFWVNILKNPQFVFDiDK 

TDHIDACLSVIAQAFIDACSISDLQLGKDSPTN 

KLLYAKEIPEYRKrVQRYYKLQIQDMTPLSEQE 

MNAHLAEESRKYQNEFNTNVAMAEIYKYAK 

RYRPQ1MAALEANPTARRTQLQHKFEQWAL 

MEDNIYECYSEA 


730 


2080 


A 


5744 


3 


292 


QPSPLFHSHLETLQLLRTAQLPEQVSWPWGQ 
VANGKGNQRNMGSPQPSLLAFERNLELQrMG 
LGYSLLMGKLRPRVAKDTLRVHRDSTPSPLT 
LKD 


731 


2081 


A 


5747 


1 


382 


FLKCMRKAFRSSK LLQVGYTPDGKDDY RWC 
FRVDEVNAVTTWNTNVGIINEDPGNCEGVKRT
LSFSLJR.SSJR.VSGRHWKNFALVPLLREASARD 

RQSAQPEEVYLRQFSGSLKPEDAEVFKSPAAS 

GEK 


732 


2082 


A 


5753 


198 


3 


AQAESSTVASPEATAGPLCTRIPNVPPPTPIRP 
PGKiQAQLPCPSPVRFTSARIPPASRPQTKS 


733 


2083 


A 


5754 


2 


2223 


AAGPPGLEAEGRAPESAGPGPGGDAAETPGL 

PPAHSGTLMMAFRDVTVQIANQNISVSSSTAL 

S V AN CLG AQTVQAPAEPAAGKAEQGETS GR 

EAPEAPAVGREDASAEDSCAEAGASGAADG 

ATAPKTEEEEEEEETAEVGRGAEAEAGDLEQ 

LNRTSTSTKSAKSGSEASASASKDALQAMILS 

LPRYHCENPASCKSPTLSTDTLRKRLYRIGLN 

LFNTNPDKGIQFLISRGFIPDTPIGVAHFLLQRK 

GLSRQMIOEFLGNSKKQFhIRBVLDCVVDEM 

DFSSMELDEALRKFQAHTR.VQGEAQKVERLIE 

AFSQRYCMCNPEVVQQFHNTPDTIFELAFAIILL 

NTDMYSPNIKPDRKMMLEDFIRNLRGVDDG 

ADIPRELVVGIYERIQQKELKSNEDHVTYVTK 

Vh,KblV(jMKl VLb VrhlKKLVC(Jv>Ki-rtV 1DV 

NKLQKQAAHQREVFLFNDLLVILKLCPKKKS 

SSTYTFCKSVGLLGMQFQLFENEYYSHGITLV 

TPLSGSEKKQVLHFCALGSDEMQKFVEDLKE 

CIACI/TCI T?rMTP TT7 WT?T nVri/VlTVTI QVX/"DfTl A 

oJAJt v 1 tLct^IKlc. W b,L,EJf>*K£\jyj i -IS. 1 i^or ivr L.OA 

QGDPQSKOGSPTAKREAALRERPAESTVEVSI 

HNRLQTSQHNSGLGAERGAPVPPPDLQPSPPR 

QQTPPLPPPPPTPPGTLVQCQQIVKVIVLDKPC 

LARMEPLLSQALSCYTSSSSDSCGSTPLGGPG 

SPVKVTHQPPLPPPPPPYNHPHQFCPPGSLLH 

GHRYSSGSRSLV 


734 


2084 


A 


5788 


8 


362 


SSVMGDLVGQGLEEQIVARDENSWLIDGGTP 
rDDVMRVLDIDEFPQSGNYETIGGFMMFMLR 
KJPKRTDSVKFAGYKFEWDIDNYRIDQLLVT 
RID SKAT ALSPKLPDAKDKEESVA 


735 


2085 


A 


5827 


1 


1257 


M VFS AVLTAFH 1G rSNTTF V V YENTYMN1TL 
PPPFQHPDl^PLLRYSFETMAPTGLSSLTVNST 

NLVVCI .MVYQKAAMRSAINILLASLAFADM 

LLAVLNMPFALVTILTTRWIFGKFFCRVSAMF 

FWLFVlEGVAri.LIISrDRFLHVQRQDKLNPYR 

AKVLIAVSWATSFCVAFPLAVGNPDLQIPSRA 

PQCVFGYTTNPGYQAYVrLrSLrSFFTPFLVILY 

SFMGILNTLRHNALRIHSYPEGICLSQASKLGL 

MGLQRPFQMSIDMGFKTRAFTTILILFAVFIVC 

W APFTTY SL VATFSKHF YYQHNFFEISTWLL 

WLCY LX SALN PL J Y Y WRIKKFH DA CLDMMP 

KSFK>'LPQLPGHTKRRJRPSAVYVCGFJ-IRrW 


736 


2086 


A 


5870 


3 


268 


FTRSDELARHYRTHTGEKRFSCPLCPKQFSRS 
DHLTKHARRHPTYHPDMIEYRGRRRTPRIDPP 
LTSE VE SS ASGSGPGPAPSFTTCL 


737 


2087 


A 


5871 


2 


521 


LTWPQU^TLPELLHMSRPAEDGPSPGALVR 

RSSSLGY1SKAEEYFLLKSRSDLMFEKQSERH 

GLARRLTT ARRPPAS SEQAQQELFNELKPAV 

DGANFIVNHMRDQNNYNEEKDSWNRVART 

VDRLCLFVVTPVMVVGTAWrFLQGVYNQPPP 

QPFPGDPYSYNVQDKRFI 


738 


2088 


A 


5881 


1 


1160 


LVVTAITAILAFPNEYTRMSTSELISELFNDCG 
LLDSSKJLCDYENRFNTSKGGELPDRPAGVGV 
Y S AM WQLALTLILKIV1TIFTFGMKIPSGLFIPS 
MAVGA1AGRLLGVGMEQLAYYHQEWTVFNS
WCSOGADCITPGLYAMVGAAACLGGVTRMT 

VSLVVIMFELTGGLEY1VPLMAAAMTSKWVA 

DALGREGIYDAHIRLNGYPFLEAKEEFAHKTL 

A MD VMXPRRNDPL LTVLTQD S M TVED VET1 1 

SETTYSGFP VVV SRESQRL VGFVLRRDLI1 S1E 

NARKKQDGVVSTSITYFTEHSPPLPPYTPPTLK 

LRN1LDLSPFTVTDLTPMEIWDIFRKLGLRQC 

LVTHNGRLLGIITKKDVLKHIAQMANQDPDSI 

LFN 


739 


2089 


A 


5892 


2 


916 


TLQLAASVPFFAISLISWWLPESARWLIINGKP 

DQALQELRKVARJNGHKEAKNLTIEVLMSSV 

KEEVASAKEPRSVLDLFCVPVLRWRSCAMLV 

VNFSLUSYYGLVFDLQSLGRDIFLLQALFGA 

VDFLGRATTALLLSFLGRRTIQAGSQAMAGL 

AILANMLVPQDLQTLRVVFAVLGKGCFGISL 

TCLTrYKAELFPTPVRMTADGILHTVGRLGA 

MMGPL1LMSRQALPLLPPLLYGVISIASSLWL 

FFLPETQGLPLPDT1QDLESQICSTAAQGNRQE 

AFTVESTSLLEIVALHGAL 


740 


2090 


A 


5900 


2 


426 


RPIKTLGIGFHFSVDGVHFLTQREVQNLWKE 
NLI1LDTAKKHGYEWDTFTITMGRYKEFLQG 
KCGCHFHEVVKSKLSKEYNFIKMKRSRNHIM 
GRYFSNQSKLQQGTVTNFRSPYHVRGPINQV 
CSEII .LSRMCANKRTM 


741 


2091 


A 


5910 


3 


412 


RMPESTLLIICENGYILEAPLPTIKQEEDDHDV 
VS YEIKDMCIKCFHF S S VK.SKILRLIEIEKRER 
QRELKEKIREERRNKLAAEMGEDGEKEFQEE 
EEEKEEEEEEEEPLPEIF1PSTPSPILCGFYSEPG 
KFWV 


742 


2092 


A 


5936 


1 


482 


MGCRLLCCWFCLLQAGPLDTAVSQTPKYLV 

T QM GNDK S IKCE QNL GHDTMYW YK QD SKK. 

FLKJMFSYNNKELIINETVPNRFSPKSPDKAHL 

NLHINSLELGDSAVYFCASSQDTALQSHCIPV 

HKPPGSARKLQGSVCTCTQOSSLHSLMASDG 

VPVC 


743 


2093 


A 


5938 


1


1566 


MNSFFGTPAASWCLLESDVSSAPDKEAGRER 

RALSVQQRGGPAWSGSLEWSRQSAGDRRRL 

GLSRQTAKSSWSRSRDRTCCCRRAWWILVPA 

ADRARRERFINfNEKWDTNSSENWHPIWNVN 

DTKHHLYSDINITYVNYYLHQPQVAAIFIISYF 

LIFFLCMMGNTVVCFrVMRNKFIMHTVTNLFI 

LNLAISDLLVGIFCMPITLLDNILAGWPFGNTM 

CKISGLVQGISVAASVFTLVAIAVDRFQCWY 

PFKPKLTIKTAFVIIMIIWVLAITIMSPSAVMLH 

VQEEKYYRVRLNSQNKTSPVYWCREDWPNQ 

EMRKIYTTVLFANIYLAPLSLIVIMYGRIGISLF 

RAAVPHTGRKNQEQWHVVSRKKQKIIKMLLI 

VALLFILSWLPLWTLMMLSDYADLSPNELQII 

NIYIYPFAHWLAFGNSSVNPirYGFFNENFRRG 

FQEAFQLQLCQKRAKPMEAYALKAKSIIVLIN 

TSNQLVQESTFQNPHGE71XYRKSAJBKPQQE 

LVMEELKETTNSSEI 


744 


2094 


A 


5966 


149 


327 


SHVCVSHYAGSSGCPAGAGAGAVALGISAVA 
LYDYQGGRLGVARGAWYMEAPDIRQGDM 


745 


2095 


A 


5970 


413 


856 


GAPHTDWAWAPTPMSGLGSGRGRQGTLASS 
PLSLPLLLAGVTGILATELFDQMARPAACMV 
CGALMWIMLILVGLGFPFIMEALSHFLYVPFL 
GVCVCGAJYTGLFLPbTK.GK'l"FQEISK£LHRL 
NFPRRAQ GPTWRSL E V] QSTEL


746


2096 


A 


5971 


3 


1343 


AQTARRIIGLELDTEGHRLFVAFSGCIVYLPLS 

RC ARHG ACQRSCLASQDPYCG WHS SRGC VD1 

RGSGGTDVDQAGNQESMEHGDCQDGATGSQ 

SGPGDSAYGVRRDLPPASASRSVPIPLLLASV 

AAAFALGASVSGLLVSCACRRAHRRRGKD1E 

TPOLPRPLSLRSLARLHGGGPEPPPPSKDGDA 

VQTPQLYTTFLPPPEGVPPPEL ACLPTPE STTE 

LPVKHLRAAGDPWEWNQNRNNAKEGPGRSR 

GGHAAGGPAPRVLVRPPPPGCPGQAVEVTTL 

EELLRYLHGPQPPRKGAEPPAPLTSRALPPEP 

APALLGGPSPRPHECASPLRLDVPPEGRCASA 

PARPALSAPAPRLGVGGGRRLPFSGHRAPPAL 

LTRVPSGGPSRYSGGPGKHLLYLGRPEGYRG 

RALKRVDVEKPQLSLKPPLVG PSSRQA VPNG 

GRFNF 


747 


2097 


A 


5998 


2 


754 


DHASLPCSWNHRFDVETRHVFIGDHSGQVTI 

LKLEQENCTLVTTFRGHTGGVTALCWDPVQ 

RVLFSGSSDHSVIMWDIGGRKGTAIELQGHN 

DRVQALSYAQHTRQLISCGGDGGIVVWNMD 

VERQETPEWLD SDSCQKCDQPFFWNFKQM W 

DSKKIGLRQHHCRKCGKAVCGKCSSKRSS1PL 

MGFEFEVRVCDSCIIEAITDEERAFTATFHDSK 

HN1 VHVHFD ATRG WLLTSGTDK V7KL WDMT 

PVVS 


748 


2098 


A 


6001 


2 


747 


AMVFGGWPYVPQYRDIRRTQNADGFSTYV 
CLVLLVANTLRILFWFGRRFESPLLWQSAIMIL 
TNrLLMLKLCTEVRVANELNARRRSFTAADS 
KDEE VK VAPRRSFLDFDPHHF WQ WSSFSD Y V 
QC VLAFTGV AGYITYL SIDS AL F VETLG FL A V 
LTEAMLGVPQLYRNHRHQSTEG.MSIKMVLM 
WTSGDAFKTA YFIXKGAPLQFS VCGLL Q VL V 
DLAILGQAYAFARHPQKPAPHAVHPTGTKAL 


749 


2099 


A 


6002 


2 


447 


GRPDRSELVRMHILEETFAEPSLQATjQMKLK 
RARL ADDLNEKIAQRPGPMEL VEKNILPVD S S 
VKEAIIGVGKEDYPHTQGDFSFDEDSSDALSP 
LX^PASQESQGSAASPSEPKVSESPSPVTTNTP 
AQFAS VSPTVPEFLK 1PPTAD 


750 


2100 


A 


6004 


2 


427 


LLTQAMLVLPFIRPQWFTPGPRLQAQGPCQEG 

ft^WELRLRNYVPEDEDLNKRRVPQAKPDAV 

QEKVKEQLEAAKPEPVIEEVDLAKLAPRKPD 

WDLKRDVAKKLEKLLKRTQRAIAELIRERLK 

GQEDSLDSAVDAATEHKTC 


751 


2101 


A 


6007 


33 


1280 


TDQAKVDNQPEKLVRSAEDVSTVPTQPDNPF 

SHPDKLKRMSKSVPAFLQDESDDRETDTASE 

SSYQLSRHKKSPSSLTNLSSSSGMTSLSSVSGS 

VMSVYSGDFGNLEVKGN1QFAIEYVESLKEL 

HVFVAQCKDLAAADVKKQRSDPYVKAYLLP 

DKGKMGKKKTLWKKTLNPVYNE1LRYKIEK 

QILKTQKLNLSIWHRDTFKRNSFLGEVELDLE 

TWDWDNKQNKQLRWYPLKRKTAPVALEAE 

NRGEMKLALQYVPEPVPGKKLPTTGEVHIWV 

KECLDLPLLRGSHLNSFVKCTILPDTSRKSRQ 

KTRAVGKTTNPJFNHTMVYDGFRPEDLMEAC 

VELTVWDHYKLTNQFLGGLRIGFGTGK S YGT 

EVDWMDSTSEEVALWEKMVNSPNTWIEATL 

PLRMLLIAKISK 


752 


2102 


A 


6028 


108 


1283 


KEIFSPFELIS VKPLCLLLGVTC SQSMAFEELL 
SQ VCjULOKi* QMLHL. Vr lLr SJLMi-LLrtiLLLbNr 
AAAIPGHRCWVHMLDNNTGSGNETGILSEDA
LLR1SIPLDSNLRPEKCRRFVHPQWQLLHLNG 

TIHSTSEADTEPCVDGWVYDQSYFPSTIVTKW 

DLVCDYQSLKS WQFLLLTGMLVGGIIGGI IV 

SDRFGRRHLRWGLLQLAITDTCAAFAPTFPV 

YCVT.RFLAGFSSMinSNNSLPITEWIRPNSKAl. 

WILSSGALNIGQIILGGLAYVFRDWQTLHVV 

ASVPFFVFFLLSRWLVESARWLrrTNKLDEGL 

KALRKVARTNGIKNAEETLNIEWRSTMQEE 

LDAAQTKTTVWDLFRNPSMRKRICILVFLRK 

KNLKEKA 


753 


2103 


A 


6043 


1 


1470 


DSFESILRLIFE1HHSGEKGDIWFLACEQDEEK 

VCETVYQGSNLNPDLGELVVVPLYPKEKCSL 

FKPLDETEKRCQVYQRRWLTTSSGEFLIWSN 

SVRFVIDVGVERRKVYNPRIRANSLVMQPISQ 

SQAEIRKQILGSSSSGKFFCLYTEEFASKDMTP 

LKPAEMQEANLTSN1VLFMKRIDIAGLGHCDF 

MNRPAPESLMQALEDLDYLAALDNDGNLSE 

FGIIMSEFPLDPQLSKSILASCEFDCVDEVLTIA 

AM VTAPNCFS HVPHG AEEAALTC WKTFLHPE 

GDHFTLISIYKAYQDTTLNSSSEYCVEKWCRD 

YFLNCSALRMADVIRAELLEIIKRIELPYAEPA 

FGSKENTLNI KKALLS G YFMQI ARD VDGSGN 

YLMLTHKQVAQLHPLSGYSrrKKMPEWVLF 

HKFSISENNYIR1TSEISPELFMQLVPQYYFSNL 

PP S E S KDIL QQ WDHLSP VSTMNKEQ QMCET 

CPETEQRCTLQ 


754 


2104 


A 


6055 


2 


394 


YYALHHWPFPDLLCQTTGAIFQMNMYGSCIF 
I ,MLTNVDRYAAIVHPLRLRHLRRPRVARLLC 
LGVWALILVFAVPAARVHRPSRCRYRDLEVR 
LCFE SFSDEL WKGRLLPLVLLAEALGFLLPL A 
AVVYSS 


755 


2105 


A 


6059 


3 


1795 


LGLGSGTLLSVSEYKKKYREHVLQLHARVKE 

RNARSVKITKRFTKLLIAPESAAPEEALGPAEE 

PEPGRARRSDTHTFNRLFRRDEEGRRPLTVVL 

QGPAGIGKTMAAKK1LYDWAAGKLYQGQVD 

FAFFMPCGELLERPGTRSLADLILDQCPDRGA 

PVPQMLAQPQRLLFILDGADELPALGGPEAAP 

CTDPFEAASGARVLGGLLSKALLPTALLLVTT 

RAAAPGRLOGRLCSPQCAEVRGFSDKDKKK 

YFV-KFFRDERRAERAYRFVKJENETLFALCFV 

PFVCWrVCrVLRQQLELGRDLSRTSKTTTSVY 

LLFITS VLSS APV ADGPRLQGDLRNLCRLARE 

GVLGRRAQFAEKELEQLELRGSKVQTLFLSK 

KELPGVLETEVTYQFIDQSFQEFLAALSYLLE 

DGG VPRTAAGGVGTLLRGD AQPH SHL VLTT 

RFLFGLLSAERMRDIERHFGCMVSERVKQEA 

LRWVQGQGQGCPGVAPEVTEGAKGLEDTEE 

PEEEEEGEEPNYPLELLYCLYETQEDAFVRQA 

LCRFPELALQRVRFCRMDVAVLSYCVRCCPA 

GQALRLISCRLVAAQEKKKKSLGKRLQASLG 

GG 


756 


2106


A 


6060 


12 


436 


SGRPTRPAKFIXJQGMGRFMLTLVCQGSIMMS 
ARDLIMNNLTELQPGLFHHLRFLEELRLSGNH 
LSHIPXjQAFSGLYSLKILMLHNNQLGGIPAQA 
LWELPSLQSLRLDANLISLVPERSFEGLSSLRH 
LWLDDNALTEIPS 


757 


2107 


A 


6063 


54 


419


ITPL GLG AADMC AFP WLLLLLLLQEGS QRRL 
WRWCGSEEWAVLQESISLPLEIPPDEEVENII 
WSSHKSLATVVPGKEGHPAT1MVTNPHYQG
QILTMLLRSLQQPSASWPRDCSSSCSW


758 


2108 


A 


6066 


125 


438 


IGISCPATIFVPMFSH SL1GIG EEYQLPYYNMV 
PSDPSYEDMREWCVKRLRPIVSNRWNSDEC 
LRA VLKL MSEC W AHNP ASRLT ALRIKKTL AK 
MVESQDVKI 


759 


2109 


A 


6072 


3 


650 


PGRRFRPAALEERAMEKLREKVPFQNRGKGT 

L S SI IPNNS DTRKATETTS L S S KPEYVNPDFRW 

SKDPSSKSGNLLETSEVGWTSNPEELDP1RLA 

LLGKSGLSCQVGSATSHPVSCQEPIDEDQRISP 

KDKSTAGREFSGQVSHQTTSENQCTPIPSSTV 

HSSVADMQNMPAAVHALLTQPSLSAAPFAQ 

RYLGTLPSTGSTTLPQCHAGNATVW 


760 


2110 


A 


6077 


3 


730 


PLRLTLMEEVLLLGLKDREGYTSFWNDC1SSG 

LRGCML1ELPLRGRLQLEACGMRRKSLLTRK 

VICKSDAPTGDVLLDEALKHVKETQPPETVQ 

NWIELLSGETWNPLKLHYQLRNVRERLAKNL 

VEKGVLTTEKQNFLLFDMTTHPLTNNNIKQR 

LIKKVQEAVLDKWVKDPHRMDRRLLALIYL 

AHASDVLENAFAPLLDEQYDLATKRVRQLLD 

LDPE VECL KANTNE VL WA WAAFTK 


761 


2111 


A 


6078 


833 


390 


fVSFHLSGFKJCFVRPFSFLSVHGLQVDEYHSV 
HQKLS ADMADHSNLI RSLL VGAED ARLMRD 
MKTMKSRYMEL YDLNRDLLNG YKIR WNNH 
TELLGNLKAVNQAIQRAGRLRVGKPKNQVIT 
ACRD Al RSNNIN TLF KIMRV GT A S S 


762 


2112 


A 


6079 


2 


2686 


KKAITCGEKEKQDLIKSLAMLKDGFRTDRGS 

HSDLWSSSSSLESSSFPLPKQYLDVSSQTDISG 

S FG1NSNNQI . A EKVRLRLRYEE AKRRIANLKI 

QLAKLDSEAWPGVLDSERDRL1LTNEKEELLK 

EMRFISPRKWTQGEVEQLEMARKRLEKDLQ 

AARDTQ S KALTERLKLN SKRNQ L VRELE EAT 

RQVATLHSQLKSLSSSMQSLSSGSSPGSLTSSR 

GSLVASSLDSSTSASFTDLYYDPFEQLDSELQ 

SKVEFLLLEGATGFRPSGCITTIHEDEVAKTQ 

KAEGGGRLQALRSLSGTPKSMTSLSPRSSLSS 

PSPPCSPLMADPLLAGDAFLNSLEFEDPELSA 

TLCELS LGNS AQERYRLEEPGTEGKQLGQ AV 

NTAQGCGLKVACVSAAVSDESVAGDSGVYE 

ASVQRLGASEAAAFDSDESEAVGATRIQIAJLK 

YDEKNKQFAnJlQLSNl.SALLQQQDQK.VNIR 

VAVLPCSESTTCLFRTRPLDASDTLVFNEVFW 

VS MSYPALHQKTLRVDVCTTDRSHLEECLGG 

AQISLAE VCRSGER STR WYNLLS YK YLKKQS 

RELKPVG VMAP ASGP ASTD AV S ALLEQT AVE 

LEKRQEGRS STQTLEDS WRYEETSENE AVAE 

EEEEE V EEEEG E ED VFTEKA S PDMD G YP ALK 

VDKETNTETPAPSPTVVRPKDRRVGTPSQGPF 

LRGSTIIRSKTFSPGPQSQYVCRLNRSDSDSST 

LSK3CPPFVRNSLERRSVRMKRPSPPPQPSSVK 

SLRSERL1RTSLDLELDLQATRTWHSQLTQEIS 

VLKELKEQLEQAKSHGEKELPQWLREDERFR 

LLLRMLEKRMDRAEHMGELQTDKMMRAAA 

KDVHRLRGQSCKEPPEVQSFREKMAFFTRPR 

MNIPALSADDV 


763 


2113 


A 


6082 


3 


1558 


PHPIRFSKLCVSFNNQEYNQFCVIEEASKANE 
VLENLTQGKMCLVPGKTRKLLFKFVAKTED 
VGKKIEITSVDLALGNETGRCWLNWQGGGG 
DAASS QE ALQ AARSFKRRPKLPDNE VH WGS II 
IQASTMIISRVPNISVHLLHEPPALTNEMYCLV
VTVQSHEKTQIRD VKLTAGL KPGQD ANLTQK

THVTLHGTELCDESYPALLTDIPVGDLHPGEQ 

LEKMLYVRCGTVGSRMFLVYVSYLrNTTVEE 

KEIVCKCrlKDETVTTETVFPFDVAVKFVSTKF 

EHLERVYADIPFLLMTDLLSASPWALTrVSSF 

LHLAPSMTTVDQLESQVDNVILQTGESASECF 

CLQCPSLGNTEGGVATGHY1ISWKRTSAMENI 

P1ITTVTTLPHVIVENIPLHVNADLPSFGRVRES 

LPVKYHLQNKTDLVQDVEISVEPSDAFMFSG 

LKQIRLRILPGTEQEMLYNFYPLMAGYQQLPS 

LNINLLRFPNFTNQLLRRFIPTSIFVKPQGRLM 

DDTSIAAA 


764 


2114 


A 


6093 


1 


1422 


AAADLANSNAGAAVGRKAGPRSPPSAPAPAP 

PPPAPAPPTLGNNHQESPGWRCCRPTLRERN 

ALMFNNELMADVHFWGPPGATRTVPAHKY 

VLAVGSSVFYAMFYGDLAEVKSEIHIPDVEPA 

AFLILLKYMYSDE1DLEADTVLATLYAAKKYI 

VPAI AKACVNFI FT'sLEAKNACVLLSOSRI F 

eepeltqrcwevldaqaemalrsegfceidr 

qtlehvtrealntkeavvfeavlnwaeaec 

kroglpitprnkrhvlgralylvriftmti.ee 

fangaaqsdiltleethsiflwytatnkprld 

fpltkrkgi.apqrchrfqssayrsnqwryrg 

rcdsiqfavdrrvfiaglglygsssgkaeysv 

kielkrlgvvlaqnltkfmsdgssntfpvwf 

ehp vq veqdtfytas a vldgsel s yfgqegm 

tevqcgkvafqfqcssdstngtgvqggqipe 

LIFYA 


765 


2115 


A 


6099 


1 


1150 


S GFTHY ATY DF I VKGS CFCN VHADQC IP VHGF 

RPVKAPGTFHMVHGKCMCKHNTAGSHCQH 

TAPI Y>Jr>RPWFAAr>nKTriAPNRrRTrKr r NG 

HADTCHFDVNVWEASGNRSGGVCDDCQHN 

TEGQYCQRCKPGFYRDLRRPFSAPDACKPCS 

CHPVGSAVLPANSVTFCDPSNGDCPCXPGVA 

GRRCDRCMVGYWGFGDYGCRPCDCAGSCD 

PITGDCISSH'I"DIDWYHEVPDFRPVHNKSEPP 

WEWEDAQGFSALLHSGKCECKEQTLGNAKA 

FCGMKYS YVLKIKJL S AHDKGTHVE VNVKIK 

KVLKSTKLKIFRGKRTLYPESWTDRGCTCPIL 

NPGLEYLVAGHEDIRTGKUVNMKSFVQHWK 

PSLGRKVMDILKRECK 


766 


2116 


A 


6103 


2 


384 


MTAAATATVLKEGVLEKRSGGLLQLWKRKR 

CVLTERGLQLFEAKGTGGRPKELSFARIKAVE 

CVESTGRH1YFTLVTEGGGEIDFRCPLEDPGW 

NAQITLGLVKFKNQQAIQTVRARQSLGTGTL 

VS 


767 


2117 


A 


6106 


1


542 


SGSSHASDGSGFQELRICSEDQTPL1AGMCSLP 
MARY YI1KYADQKAL YTRD GQLL VGDPVAD 
NCCAEKJCTLPNRGLDRTKVPIFLGIQGGSRC 
LACVETEEGPSLQLEDVNIEELYKGGEEATRF 
TFFQSSSGS AFRLE AAA WPG WFLCGP AEPQQ 
PVQLTKESEPSARTKFYFEQSW 


768 


2118 


A 


6109 


3 


292 


FILQAVLQLSSQEARYKAFGTCV SHI GAIL AF 

YTPSVISSVMIIRVARCAAPHVIIILLANFYLLF 

PPMVNPrrYGVKTKQIRDSLGSIPEKGCVNRE 


769 


2119 


A 


6110 


1


711 


RHEPSCSNGVASTKSKQNHSKYPAPSSSSSSS 
SSSSSSSPSSVNYSESNSrDSTKSQHHSSTSNQ 
ETSDSEMEMEAEHYPNGVLGSMSTRIVNGAY 
KHEDLQTDESSMDDRHPRRQLCGGNQAATE
RIILFGRELQALSEQLGREYGKNLAHTEMLQD
AFSLLAYSDPWSCPVGQQLDPIQREPVCAAL 
NSAILESQNLPKQPPLMLALGQASECLRLMA 
RAGLGSCSFARVDDYLH 


770 


2120 


A 


6125 


2 


570 


YFGLNLHVQHLGNNVFLLQTI ,FG AVILL ANC 
VAPWALKYMNRRASQMLLMFLLAICLLAIIF 
VPQEMQMLREVLATLGLGASALANTLAFAH 
GNEVIPTIIRARAMGINATFAN1AGALAPLMM 
IL S VYSPPLPWIIYGVFPFI SGFAFLLLPETRNK 
PLFDTIQDEKNERKDPREPKQEDPRVEVTQF 


771 


2121 


A 


6126 


909 


353 


RSFVLDTASAICNYNAHYKNHPKYWCRGYF 
RD YCNIIAF S PN STNHV AL RDTGNQL I VTM S C 
LTKEDTGWYWCGIQRDFARDDMDFTELIVT 
DDKGTLANDFWSGKDLSGNKTRSCKAPKW 
RKADRSRTSILIICILITGLG1ISVISHLTKRRRS 
QRNRRVGNTLKPFSRVLTPKEMAPTEQM 


772 


2122 


A 


6148 


7 


810 


FVLGILAl^HTISPFMNKFFPASFPNRQYQLLF 

TQGSGENKEEIINYEFDTKDLVCLGLSSIVGV 

WYLLRKHWIANNLFGLAFSLNGVELLHLNN 

VSTGCILLGGLFIYDVFWVFGTNVMVTVAKS 

FEAPIKLVFPQDLLEKGLEANNFAMLGLGDV 

VIPGI FIALLLRFDI SLKKNTHTYFYTSF AA Yl F 

GLGLTIFIMH1FKHAQPALLYLVPACIGFPVLV 

ALAKGEVTEMFSYEESNPKDPAAVTESKEGT 

EASASKGLEKKEK 


773 


2123 


A 


6161 


3 


1088 


CQPML VTRKNHPKLLLRRTES V AEKMLTN W 

PIDAITGEARYSLSEDKLIRHLIDYKTLTLNCV 

NPENENAPEVPVKGLDCDTGTQAKEKLLDA 

AYKGVPYSQRPKAADMDLEWRQGRMARI IL 

QDED VTTKJDND W KRLNTL AH YQ VTD G S S V 

ALVPKQTSAYNISNSSTFTKSLSRYESMLRTA 

SSPDSLRSRTPMITPDLESGTKLWHLVKNHDH 

LDQREGDRGSKMVSEIYLTRLLATKGTLQKF 

VDDLFETIFSTAHRGSALPLAIKYMFDFLDEQ 

ADKHQIHDADVRHTWKSNCLPLRFWVNVIK 

NPQFVFDIHKNSITDACLSW 


774 


2124 


A 


6163 


860 


125 


KTAVKKRNLNPVFNETLRYSVPQAELQGRVL 

SLSVWHRESLGRNIFLGEVEVPLDTWDWGSE 

PTWLPLQPRVPPSPDDLPSRGLLALSLKYVPA 

GSEGAGLPPSGELHFWVKEARDLLPLRAGSL 

DTYVQCFVLPDDSRASRQRTRWRRSLSPVF 

NIFTMVYDGFGPADLRQACAELSLWDHGALA 

NRQLGOTRLSLGTGSSYGLQVPWMDSTPEEK 

QLWQALLEQPCEWVDGLLPLRTNLAPRT 


775 


2125 


A 


6191 


2 


392 


ARGIGSLGRDHSGSGGGTGMAGAWVRKAAD 
YVRSKDFRDYLMSTHFWGPVANW GLPIAAIT 
DMK\KSPEnSRRMTFAL*CYSLTFVRFAHYVQ 
\PWNVVLMLGCHTAVDFDQLISSMPCISHGMT 
ASASAL 


776 


2126 


A 


6217 


1 


827 


FRGYWGVREAFTDASWSGGLGPGKPGMK1T 
RQKHAKKHLGFFRNNFGVREPYQILLDGTFC 
QAALRGR1QLREQLPRYLMGETQLCTTRCVL 
KELETLGKDLYGAKLIAQKCQVRNCPHFKNA 
VSGSECLLSMVEEGNPHHYFVATQDQNLSVK 
VKKJCPGVPLMFIIQNTMVLDKPSPKT1AFVKA 
VESG\RLSC^MRKKVSNISKRNRV**KTLNRG 
RJUCKJ*KXISGPNPLSCLKJOCKKAPDTQSSASE 
KKRKRKJURNRSNPKVLSEKQNAEGE 


SEQ ID NO: of nucleotide sequence 


SEQ ID NO: of peptide sequence 


Method 


SEQ ID NO: in USSN 09/496,914 


Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence 


Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence 


Amino acid sequence (A=Alanine, C=Cysteine, 
D=Aspartic Acid, E=Glutamic Acid, 
F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, 
M=Methionine, N=Asparagine, P=Proline, 
Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, 
Y=Tyrosine, X=Unknown, *=Stop codon, 
/=possible nucleotide deletion, \=possible 
nucleotide insertion) 


777 


2127 


A 


6236 


1038 


1402 


YYQIS SLPS I VGNG 1FL WLLICIFL AKQGGSRL * 
FQPFGRPRGGGKLRSGVLGQPGQHGETP/SFF 
YNSKISPALWGPPVITSALGGEAGKSL*PRRQ 
RFQRGGIAPLPSRVRGRAKLFLKKK 


778 


2128 


A 


6237 


422 


913 


ASFFHHHRG AFLLLLAIPGS* GQDQSLIHWSN 
AVSNADVLLDLK\N*LDHU>EEKMPLVEVKVVP 
PQVL\SEPN*RSGGCFSAPSFEVPPWTGEVKP/ 
SPQRDGGALG\QGPLGIPSDSILALLKKQT*RA 
LLNWPLGSLRRSSCFGGQDGQDLKPRSGLGC 
NSFRYRR 


779 


2129 


A 


6249 


420 


36 


ARAPSPSFSVRDVELSDPARERGEMPVAVGP 
YGQSQPSCFDRVKMGFVMGCAVGMAAGAL 
FGTFSCLSS1LVSSSG/SGMRGRELMGGIGKTM 
MQSGGTFGTFMA1GMGIRC* PWLFTTS VPSH 
QSQPMY 


780 


2130 


A 


6263 


4.1 S 


1380 


RiMWMrnHfimMT rrrvnAFAAF^i mtiavh 

jfAJiiyjLLVi vi^i-^xv.VwiJi v^jlt i,i iX i i v vjjTJ jl^ivi i jutv v vj 

TD Y \VL Y SRGVCRTKSTSDNETSRKNEE VMT 
HSGLWRTCCLEGAFRGVCKKIDHFPEDADYE 
QDTAE YLLRA VRAS S VFPL^S VTLLFFGGLCV 
AASEFHRSRHNVILSAGIFFVSAGLSNIIGirVYI 
S\ANAGRTPGQR\DSKKSYSYGWSF/YFSGAFS 

LKKSTFARJLPPYRYRFRRRSSSRSTEPRSRDLS 
PISKGFHTIPSTDISMFTLSRDPSKrrMGTLLNS 
D RD HAFL Q FHN STPKXFKE S LHNNP ANRRTT 
PV 


781 


2131 


A 


6274 


832 


318 


RIIKVKDLKQTLAIKTAYPRCKCLVEMDQIFH 
LQVKQKQLACLCTWQARDPDCPPSTKWL/L 
VGPGMGCMVALFQDSIAWSNKSMPSSLSAIS 
QSPCQVQAPEGPSSFHLPTLSFTTCLSWQGGD 
LEFLGDLKGCSELKNFQEL1TQSALVHPKADV 
WWYPfiRPT I ATI PSN 


782 


2132 


A 


6281 


1324 


393 


WISLPSSLLCRKNGSSAEDDRRXGEPSAEEAEG 

ER£DWGIGSA*SVGAVSK.VFSARF*RTYPS\E 

DEEEVTHQKSSSSDSNSEEHRKKKTSRSRNK 

KKRKNKSSKRKHRKYSDSDSNSESDTNSDSD 

DDFGCRVKAKKKKXKKKHKTKKKKNKKTKK 

ES SDSSCKDSEEDLSEATWMEQPNVADTMDL 

IGPEAPI1HTSQDFKPI.KYGHALLPGEGAAMA 

EYVKAGKRIPRRGEIGLTSEEIGSFECSGYVM 

SGSRHRRMEAVRLRKENQrYSADEKRALASF 

NQEERRKRESKILASFREMVHKKTKGKDDK 


783 


2133 


A 


6305 


201 


1032 


WDDYPQGALRRREAAEGLHFLGPPGRVRGQ 

I ,RGITGPA WYCHSPSHS1 J SAFCHLFTPSRCP 

AMARPPVPGSV WPN WHES/RRGQG VPGLHS 

AQEPPAGVWAA*AASAAAA\LSTDTASYKIFV 

SGKSGVGKTALVAKLAGLEVPWHHETTGIQ 

TTVVFWPAKLQASSRVVMFRFEFWDCGESA 

LKKFDHMLLACMENTDAFLFLFSFTDRASFE 

DLPGQLARIAGEAPGWRMVIGSKFDQYMHT 

DVPERDLTAFRQAWELPLLRVKSVPGRRLG 


784 


2134 


A 


6308 


86 


96 


GSSPDPASLITMKNQDKKNGAAKQSNPKSSP 

GQPEAGPEGAQERPS QAAPA VEAEGPGSSQA 

PRKPEGAQARTAQSGALRDVSEELSRQLEDIL 

STYCVDNNQGGPGEDGAQGEPAEPEDAEKSR 

TYVARNGEPEPTPVVNGEKEPSKGDPNTEEIR 

QSDEVGDRDHRRPQEKKKAKGLGKEITLLM 

QTLNTLSTPEEKLAALCKKYAELLEEHRNSQ 

KQMKLLQKKQSQLVQEKDHLRGEHSKAVLA 

RSKLESLCRELQRHNRSLKEEGVQRAREEEE 

KJRKEVTSHFQVTLNDIQLQMEQHNERNSICLR 

QENMELAERLKKLIEQYELREEH1DKVFKHK 

DLQQQLVDAKLQQAQEMLKEAEERHQREKD 

FLLKEAVESQRMCELMKQQETIiLKQQLALY 

TEKFEEFQNTLSKSSEVFTTFKQEMEKMTKKI 

KKLEKETI'MYRSRWESSNKALLEMAEEKTV 

RDKELEGLQVK1QRLEKLCRALQT/GAQ*PVR 

GQRWGSHRTSAVR1FS 


785 


2135 


A 


6319 


1493 


889 


SPQGPLLRSVSPVSAGASSVTPGGAQPGVTTT 
PPSLV A VAP APGS AAGPAAG WQ* HAGCR/WT 
KJLPWSWGMRPMKIFFSEEYRSISTRISHDAL* 
EKCTQPAKPLSMIR\TGSSVSPG/PLVKWNWT 
RREFRNSGTR W S S CCGM SCM YSFLGHCS V/S 
QDLPL VH VD VG WQPPLG PTVG LRPGLLPLHD 
TTPCQKLVVDDLDWA 


786 


2136 


A 


6320 


551 


135 


RWLPVAECDSSCVGCTGEGPGNCKECISGYA 
REHGQC AD VDECSL AEKTCV RKNENC YNTP 
GSYVCVCPDGFEE77RRCLCAAGRG* SHRRRK 
PDT AALPRRP VMCRTYPLNYS EG CPVENVAL 
RMPSPAVDSGGERLPAL 


787 


2137 


A 


6330 


1693 


227 


DYV LT AELHRQ RSPG VSFGL S VFNLMN AIMG 
SGILGLAYVMANTGVFGFSFLLLTVALLASYS 
VHLLLSMCIQTAYLGP*TNYFMVLPAH*LTCL 
PLIEFLQSL*NSL\*AVTSYEDLGLFAFGLPGKL 
VVAGTIITONIGAM^YT I IIKTFT PAATAFFT T 

gdy sry w yldgqtlli iic vgi vfplallpkjg 
flg yts slsfffmmff al wtikk wsipc pi ,ti . 
n y vekgfqi snvtddckpkjlfhfskes a y alp 
tmafsflchtsilpiycelqspskkrmqnvtn 
taialsfliyfisalfgyltfyd/gttkaqrge 
vtchrikdkvesellkg* **ip* shdvwmtw 
klcilfavllstvpliflfparkavtmmffsnfp 
fs wirhflitl alniii vll aty vpdirnvfg vv 
g aststclififpglfylkx.sredfls wkklg v 
gcfc/llsfktsilrnslsvy1ilpasrksiyfki 


788 


2138 


A 


6351 


1 


6622 


PRSLCFSLWAEAAVLADGGLRRRRRLLRGTM 

SASFVPNGASLEDCHCNLFCLADLTGIKWKK 

YVWQGPTSAPILFPVTEEDPILSSFSRCLKADV 

LG/VWRRDQRPERREAL* IFWGGEDPWLLTLF 

TMTYQKJCKMECGRMDFPMNAVLCFSKAVH 

NLLERCLMNRNFVRIGKWFVKPYEKDEKPrN 

KSEHLSCSFTFFLHGDSKVCTSVEINQHQPVY 

LLSEEHITLAQQSNSPFQVILCPFGLNGTLTGQ 

AFKMSDSATKKLIGEWKQFYPISCCLKEMSE 

EKQEDMDWEDDSLAAVEVLVAGVRMTYPAC 

FVLVPQSDIPTPSPVGSTHCSSSCLGVHQVPAS 

TRDPAMSSVTLTPPTSPEEVQTVDPQSVQKW 

VKFSSVSDGFNSDSTSHHGGKIPRKLAN1 IV V 

DRV WQECNMKRAQNKRKYS AS SGGLCEEAT 

AAKVASWDFVEATQRTNCSCLRHKNLKSRN 

AGQQGQAPSLGQQQQILPKHKTNEKQEKSEK 

PQKRPLTPFHHRVSVSDDVGMD\ADS\ASQRL 

V\I SAP\DSQ\VRFSNIR\TNDV AKXTPQMHGTE 

MANSPQPPPLSPWCDWDEGVTKTPSTPQS 

QHFYQMPTPDPLVPSKPMEDRIDSLSQSFPPQ 

YQEAVEPTVYVGTAVNLEEDEANIAWKYYK 

FPKKKDVEFLPPQLPSDKFKDDPVGPFGQESV 

TSVTELMVQCKKPLK.VSDELVQQYQIKNQCL 

SAIASDAEQEPKJDPYAFVEGDEEFLFPDKKD 

RQNSEREAGKKHKVEDGTSSVTVLSHEEDA 

MSLFSPSIKQDAPRPTSHARPPSTSLIYDSDLA 

VSYTDLDNLFNSDEDELTPGSKRSANGSDDK 

ASCKESKTGNLDPL^CISTADLHKMYFTPPSL 

EQHIMGFSPMNMNNKEYGSMDTTPGGTVLE 

GNSSSIGAQFK1EVDEGFCSPKPSEIKDFSYVY 

KPENCQILVGCSMFAPLKTLPSQYLPLIKLPEE 

CTYRQS WTVGKJLELL S SGPSMPFIKEGDGSNM 

DQEYGTAYTPQTHTSCGMPPSSAPPSNSGAGI 

LPSPSTPRFPTPRTPRTPRTPRG AGGPAS AQG S 

VKYENSDLYSPASTPSTCRPLNSVEPATVPSIP 

EAHSLYVNLILSESVMNLFKDCNSDSCC1CVC 

NMNIKGADVGVY1PDPTQEAQYRCTCGFSAV 

MNRKFGNNSGLFFEDELD1IGRNTDCGKEAE 

KRFEALRATSAEHVNGGLKESEKLSDDLILLL 

QDQCTNLFSPFGAADQDPFPKSGVISNWVRV 

EFJU5CCNDCYLALEHGRQFMDNMSGGKVDE 

ALVKSSCLHPWSKRNDVSMQCSQDILRMLLS 

LQPVLQDAIQKXRTVRPWGVQGPLTWQQFH 

KMAGRGSYGTDESPEPLPIPTFLLGYDYDYLV 

LSPFALPYWERLMLEPYGSQRDIAYWLCPE 

NE ALLNG AKSFFRDLT AIYESC RLGQHRP VSR 

LLTDGIMRVGSTASKKLSEKLVAEWFSQAAD 

GNNEAFSKLKLYAQVCRYDLGPYLASLPLDS 

SLLSQPNLVAPTSQSLITPPQMTNTGNANTPS 

ATLASAASSTMTVTSGVAISTSVATANSTLTT 

ASTSSSSSSNLNSGVSSNKLPSFPPFGSMNSNA 

AGSMSTQANTVQSGQLGGQQTSALQTAGISG 

ESSSLPTQPHPDVSESTMDRDKVGIPTDGDSH 

A VTYPP AI WYIIDPFTYENTDESTN SSSVWTL 

GLLRCFLEMVQTLPPHIKSTVSVQIIPCQYLLQ 

PVKHEDREIYPQHLKSLAFSAFTQCRRPLPTS 

TNVKTLTGFGPGLAMETALRSPDRPECIRLYA 

PPFILAPVKDKQTELGETFGEAGQKYNVLFV 

GYCLSHDQRWILASCTDLYGELLETCIINTDVP 

NRARRKKSS ARKFGLQKL W EWCLGL VQMS S 

SLSKRLKDMCRMCGI S AADSPSILS ACL V AM 

EPQGSFVIMPDSVSTGSVFGRSTTLNMQTSQL 

NTPQDTSCTI IILVFPTS ASVQ VAS ATYTTENL 

DLAFNFNNDGADGMGIFDLLDTGDDLDPDII 

NDLPASPTGSPVHSPGSHYPHGGDAGKGQSTD 

RLLSTEPHEEVPNILQQPLALGYFVSTAKAGP 

LPDWFWSACPQAQYQCPLFLKASLHLHVPSV 

QSDELLHSKHSHPLDSNQTSDVLRFVLEQYN 

ALSWLTCDPATQDRRSCLPEHFWLNQLYNFI 

MNML 


789 


2139 


A 


6359 


1 


2002 


TGTLTEDGLDVMGWPLKGQAFLPLVPEPRR 

LPVGPLLRALATCHALSRLQDTPVGDPMDLK 

M VE STG W VLEEEP AAD S AFGTQ VLA VMRPP 

LWEPQLQAMEEPPVPVSVLHRFPFSSALQRM 

SVWAWPGATQPEAYVKGSPELVAGLCNPET 

VPTDFAQMLQSYTAAGYRVVALASKPLPSVP 

SLEAAQQLTRDTVEGDLSLLGLLVMRNLLKP 

QTTPVIQALRRTRIRAVN1VTGDNLQTAVTVA 

RGCGMVAPQEHLnVHATHPERGOPASLEFLP 

MESPTAVNGVKDPDQAASYTVEPDPRSRHLA 

LSGPTFGIlVKriFPKLLPKVLVQGTVFARMAP 

EQKTELVCELQKLQYCVGMCGDG ANDCGAI . 

KAAD VGISL SQAE AS VVSPFTSSMA SIEC VPM 

VTREGRCSLDTSFS VFK YMALY SLTQFI S VLIL 

YTINTNLGDLQFLAIDLVITTTVAVLMSRTGP 

ALVLGRVRPPGALLSVPVLSSLLLQMVLVTG 

VQLG G YFLTL AQPWFVPLNRTV AAPDNL FNY 

KNTWFST,SSFQYT.n,AAAVSKGAPFR\RPLTN 

NVPFLLASAL* SSVLWLVLSPGLLHGPLALR 

NITDTGFKLLLVGLVTLNFVGGLHAGERARP 

VPPRLPAPPPAQAG\SKKRFKQLERELAEQPW 

PPLPAGPLR 


790 


2140 


A 


£1 OA 


70 




GSRLPAGTSGSRGHCGPCRFRGFEVMGNPGT 

FKRGLLL S AL S YLGFETYQVI SQAA V VHAT A 

KVEEILEQADYLYESGETEKLYQLLTQYKESE 

DAELLWRLARASRDVAQLSRTSEEEKKLLVY 

EALEYAKRA/L/EKNESSFASHKWYAICLSDV 

GDYEGIKAK1ANAYTTKEHFEKAIELNPKDATS 

IHI.MGIWCYTFAEMPWYQRRIA'NACLQLPP 

*FPPYEKALG\YFHRAEQVDPNFYSKNLLLLG 

KTYLKLHNKKLAAFWLMKAKDYPAHTEED 

KQIQTEAAQLLTSFSEKN 


791 


2141 


A 


6434 


3 


1460 


1ALLIVDGLAWDDQGGLALLHISPSKLIL*QDS 

SGMS/YVMVRCTITRAFFKSLLCHICQYSIGPQ 

•VT\CPGQDACKE*KSTAN*GG*RE**PQVLFF 

AFLSNPAVKFGRMSKKQRDSLYAEVQKHQQ 

RLQEQRQQQSGEAEALARVYSSSISNGLSNLN 

Nil 1 b(j 1 Y ANUavlL>LJrK.bfcAJ Y YIN V VbUyrar 

DQSGLDMTvGlKQIKQEPIYDLTSVPNLFTYXSS 

FNMGQLAPGnwrErDRIAQNriKSHLETCQY 

TMEELHQLAWQTHTYEEIKAYQSKSREALW 

QQCAIQITHAIQYWEFAKRITGFMELCQNDQ 

ILLLKSGCLEWLVRMCRAFNPLNNTVLFEG 

KYGGMQMFKALGSDDLVNEAFDFAKNLCSL 

QLTEEEIALFSS AVLI SPDRA WLIEPRKVQKLQ 

pvtvtat nHvrnifNHi nnFTT akt taktpttta 

VCNLHGEKLQVFKQSHPEIVNTLFPPLYKELF 

NPDCATACK 


792 


2142 


A 


6440 


92 A 


781 


SRGTFRCFC1U)FFPCFSNMRLFLWNAVLTLFV 

TSLIGAL1PEPEVKIEVLQKPFICHRKTKGODL 

MLVHYEGYLEKDGSLFHSTHKHNNGQPIWFT 

LGILEALKGWGPGA*IUDMCVGEKRKLnPPA 

LGYGKEGKGKIPPESTLIFN1DLLE1RNGPRSH 

ESFQEMDLNDDWKLSKDEVKAYLKKEFEKH 

GAVVNESHHDALVEDIFDKEDEDKDGFISAR 

EFTYKHDEL 


793 


2143 


A 


6446 


3201 


152 


PRLKRLVVTEEDGGARPEALGKIAPRTPAELG 

ARADQELVTALMCDLRRPAAGGMMDLAYV 

CEWEKWSK^THCPSVPLACAWSCRNLIAFTM 

DLRSDDQDLTRMIHILDTEHPWDLHSIPSEHH 

EAITC\LEWDQSGFPGFLFSRWPTGQ1K\CWS 

MGVSTLA\NSWE\SSVGSL\VEGGPHLWALS\ 

WLHuMGVKLALHVEKSGASSFGEKFSRWKFS 

FASLTLF\GGNAMEGWIAVTVSGLVTVSLLQ\P 

SGQ VL\TST£SLCRLRARV ALADI AFTGGGN 1 

WATADGSSAXSPVQFYKVCVSVVSEKCRTDT 

DILPSL^MRCTTDLNRKDKFPAITHLKFLARD 

MSEQVLLCASSQTSSIVECWSLRKEGLPVNNI 

FQQISPWGDKQPTILKWRILSATNDLDRVSA 

VWLPFCLPISLTNTDLKVASDTQFYPGLGLAL 

AFHDG S VHIVHRLSLQTMA VF Y S S AAPRPVD 

EPAMKRPRTAGPAVHLKAMQLSWrSLALVG 

IDSHGKLSW.RLSPSMGHPLEVGLALRHLLFL 

LE YCMVTG YDWWD1LLHV QPSM VQSL VEKL 

HEEYTRQTAALQQVLSTRILAMKASLCKLSP 

CTVTRVCD YHTKLFLIA1SSTLKSLLRPI IFLNT 

PDKSPGDRLTEICTKJTDVDIDKVMINLKTEEF 

VLDMMTLQALQQLLQWVGDFVLYLLASLPN 

QPCPTSEPCPTSEPSPTSEPSPTSEPSSP* SLCVG 

SLLJ^GHSFLRDGTSLGMLRELMVVIRIWGLL 

KPSCLPVYTATSDTQDSMSLLFRLLTKLWICC 

RDEGPASEPDEALVDECCLLPSQLLIPSLDWL 

P ASDGLV SRLQPKQPLRLQFGRAPTLPGSAAT 

LQLDGLARAPGQPKroHLRRLHLGACPTEEC 

KACTRCGCVTMLKSPNRTTAVTCQWEQRW1K 

NC/LVRWALVAGAPQLPLSPAAPQLLLSYPSA 

APEPGCCKSHRSPWTLLGAVNLSPPCRAVEG 

RGPDACVTSRASEEAPAFVQLGPQSTHHSPRT 

PRSLDHLHPEDRP 


794 


2144 


A 


6490 


418 


585 


NGDKADLENESCRAQVLMPWPALWEAEGG 
GSLEPRDLRLQ*AVITPL\TPAWVTQ 


795 


2145 


A 


6499 


395 


1027 


KLLWLPPHSEQKRSPLYHPQGPSGTTPSAPXFS 

SHSPPPSLLQAXPSIAAFLRTHGHISASGPLRMP 

FPH/H*NAFLLVFPGQRSQLTS/PSHYLCREVFP 

DHHHHLCRLSLESSPLFHHRVLFCVPKQNVN 

STRAQIFCLFVHrVGCRCINTFPLHLFRLHLWL 

HFLQIPLCKKNKSVKLGKTWGRGCQSAAGS 

DTRVRAAVGAPGLPVEPLV 


796 


2146 


A 


6503 


68 


936 


I ISALLTHS SFC VFTLCQDFFTYSSMSEEVTYA 

DLQFQNSSEMEKIPEIGKFGEKAPPAPSHVWR 

PAALFLTLLCLLLLIGLGVLASMFHVTLKLIEM 

KXMNKLQNISEELQRNISLQLMSNMN1SNKIR 

N LSTTLQTIATKLCREL YSKEQEKKCKPCPRR 

WIWHKDS C YFLSDD VQTWQESKMACAAQN 

A SLLKINNKNALEFIKSQ SRS YDY WLGLSPEE 

DS/YSWYESG*YNQ\PSAWVIRNAPDLNNMY 

CGYINRLYVQYYHCTYKQRMICEKMANPVQ 

LGSTYFREA 


797 


2147 


A 


6507 


1 


881 


PGSTHASARSQVPRSAGEAAPHSRRPPGLLPH 

APRAASAQLEERMRDPHPGMTLQEGDCRGS 

QTVSLTMGTADSDEMAPEAPQHTHIDVHIHQ 

ESALAKLLLTCCSALRPRATQARGSSRLLVAS 

WVMQIVLGILSAVLGGFFYIRDYTLLVTSGA 

AIWTGAVAVLAGAAAFIYEKRGGTYWALLR 

TLLALAAFSTAIAALKLWNEDFRYGYSYYNS 

ACRISSSSDWNTPAPTQSPEEVRRLHLCTSFM 

DMLKALFRTLQAMLLGVWILLLLASLTPLWL 

/SL/RGECSQPKG* VPKKRDQKEMLEVSGI * PG 

STHAS ARSQ VPRSAGEAAPHSRRPPGL LPHAP 

RAASAQLEERMRDPHPGMTLQEGDCRGSQT 

VSLTMGTADSDEMAPEAPQHTHIDVHIHQES 

ALAKLLLTCCSALRPRATQARGSSRLLVASW 

VMQIVLGILSAVI.GGFFYIRDYTLLVTSGAAr 

WTGAVAVLAGAAAFIYEKRGGTYWALLRTL 

LALAAFSTAIAALKJLWNEDFRYGYSYYNSAC 

RISSSSDWNTPAPTQSPEEVRRLHLCTSFMDM 

LKALFRTLQAMLLGVWILLLLASLTPLWLYC 

WRMFPTKGVSP 


798 


2148 


A 


6528 


912 


2287 


VPNYLPSVSSA1GGEVPQRYVWRFCIGLHSAP 
RFLVAF AYWNHYLSCTSPCSCYRPLCRJLNFG 
L NW ENL AL L VLT YVS S S EDF/TWVPG * G RS G 
EVFPEGTGLPLPH SDLPTSWCG HSLQCGSQSS 
FPPAIHENAFIVFIASSLGHMLLTCILWRLTKK 
HTVSQE\DGLSLAGAPRQPRRKSRTSVLRIRV 
MVRWELSSNGNPGRGVLGLGLGLGNKLRVV 
GQNLGL* HCVWVVWETGE*KRWRJLQMGEE* 
GV ASRRQ* VRNSVRGLVCHNSSAPPMYMGFF 
SPTVFGGGVGG* LHVTFILHPPEVEAAGIPLLL 
GPSLPQRQGREHIWILAAPACAPFHDR*WEP 
REIRPSP* ELG LRGEPTLSYP ASCRVTRQPIP'* D 
RKSYSWKQRLFirNFISFFSALAVYFRHNMYC 
EAGVYT7FAILEYTWLTNMAFHMTAWWDF 
GNKELL ITSQPEEKRF 


799 


2149 


A 


6529 


1 


874 


FFFFQRINFIEHSGSVSLLALACDLGWCEDWS 

CCLVQGGGDLVDVVQTNHGEDEAGGDTDSV 

DEARCKESQQEAQENLREDLCLESFAKDKIL 

QIIEGSEREHEETRTKQ A A I -DGEPLGGGQLTA 

VHLHPSKEQQGQEGGERQRGARTHHWRGW 

EKGRRVRLRPPSGKLRADQPVRKLGGPTPS.T 

ELPGLQPHAPTPHTA/PATPTYSPAPDTPNPPV 

RWKCPLPVEPRTRQLCRERTRKACPPKPRPPL 

GLPGDPTGPVTHHAPPVSPTGASGQERRAEP 

GAVSYAHASATK 


800 


2150 


A 


6544 


2 


662 


SAQRWAAVAGRWGCRLLALLLLVPGPGGAS 

EITFELPDNAKQCFYED1AQGTKCTLEFQVTTG 

GHYDVDCRLEDPDGKVLYKEMKKQYDSFTF 

TASKNGTYKFCFSNEXFSTrTHKTVYFDFQVG 

FATHLCFLVR/DRVSALTQMESACVSIHEALKS 

VTDYQTHFRLREAQGRSRAEDLNTRVAYWSV 

GEAXILLVVSIGQVFELKJSFFSDKJITTTTRVGS 


801 


2151 


A 


6556 


1 


1319 


TPCMECIKGEGLREPQNLSGSQREPQTEGSM 

DGWRRMPRWGLLLLLWGSCTFGLPTDTTTF 

KR1FLKRMPSIRESLKERGVDMARLGPEWSQP 

MKRLTLGNTTS S VILTNYMDTQ YYGEIGIGTP 

PQTFKWFDTGSSNVWVPSSKCSRLYTACVY 

HKJLFDASDS S S YKHNGTELTLRYSTGTVSGFL 

SQDUTVGGITVTQMFGEVTEMPALPFMLAEF 

DGVVGMGFIEQAIGRVTPIFDNnSQGVLKED 

VFSFYYNRDSENSQSLGGQIVLGOSDPQHYE 

GKFHYINLIKTGVWQIQMKGVSVGSSTLLCE 

DGCLALVDTGASYISGSTSSIEKLMEALGAKE 

KRLFDYVVKCNEGPTLPPTFLFLLGGKDTPLT 

SADYLFQESYSSKKLSTLAfflAMYTPPPTGPTL 

\AL GATF\ IRK FYTE FD RGNNPHG F AL AR 


802 


2152 


A 


6567 


13 


6147 


MCLGRMGASSPRSPEPVGPPAPGLPFCCGGSL 

LAVVVLLAIJPVAWGQCNAPEW\LPFARPTNL 

TDEFEFPIGTYLNYECRPGYSGRPFSIICLKNS 

VWTGAKDRCRRKSCRNPPDPVNGMVHVIKG 

IQFGSQIKYSCTKGYRLIGSSSATCnSGDTVIW 

DNFTPICDRIPCGLPPTTTOGDnSTNRENFHY 

GSVVTYRCNPGSGGRKVFELVGEPSIYCTSND 

DQVGIWSGPAPQCIIPNKCTPPNVENGILVSD 

NRSLFSLNEWEFRCQPGFVMKGPRRVKCQA 

LNKWEPELPSCSRVCQPPPDVLHAERTQRDK 

DNFSPGQEVFYSCEPGYDLRGAASMRCTPQG 

D W SPAAPTCE VKS CDDFMGQLLNGR VLFPV 

NLQLGAKVDFVCDEGFQLKGSSASYCVLAG 

MESLWNSSVPVCEQrFCPSPPVTPNGRHTGKP 

LEVFPFGKAVNYTCDPHPDRGTSFDLIGEST1R 

CTSDPQGNGVWSSPAPRCGILGHCQAPDHFL 

FAKLKTQTNASDFPIGTSLKYECRPEYYGRPF 
SITCLDNLVWSSPKDVCKRKSCKTPPDPVNG 

MVHVITDIQVGSRINYSCTTGHRLIGHSSAECI 

LSGNAAHWSTKPPICQRIPCGLPPT1ANGDFIS 

TNRENFHYGSWTYRCNPGSGGRKVFELVGE 

PS I YCT SNDDQVG IWSGPAPQCI IPNKCTPPN V 

ENGILVSDNRSLFSLNEWEFRCQPGFVMKGP 

RRVKCQA1NKWEPELPSCSRVCQPPPDVLHA 

ERTQRDKDNFSPGQEVFYSCEPGYDLRGAAS 

MRCTPQGDWSPAAPTCEVKSCDDFMGQLLN 

GRVLPPVNLQLGAKVDFVCDEGFQLKGSSAS 

YCVLAGMESLWNSSVPVCEQIFCPSPPVIPNG 

RHTGKPLEVFPFGKAVNYTCDPHPDRGTSFD 

LIGESTIRCTSDPQGNGVWSSPAPRCGILGHC 

QAPDHFLFAKLKTQTNASDFPIGTSLKYECRP 

EYYGRPFSITCLDNLVWSSPKDVCKRKSCKTP 

PDPVNGMVHVITDIQVGSRINYSCTTGHRLIG 

HSSAECILSGNTAHWSTKPPICQRIPCGLPPTI 

ANGDFISTNRENFHYGSWTYRCNLGSRGRK 

VFELVGEPSiYCTSNDDQVGIWSGPAPQCIIPN 

KCTPPNVENGILVSDNRSLFSLNEVVEFRCQP 

GFVMKGPRRVKCQALNKWEPELPSCSRVCQ 

PPPEILHGEHTPSHQDNFSPGQEVFYSCEPGY 

DLRGAASLHCTPQGDWSPEAPRCAVKSCDDF 

LGQLPHGRVLFPLNLQLGAKVSFVCDEGFRL 

KGSSVSHCVLVGMRSLWNNSVPVCEHIFCPN 

PPAILNGRHTGTPSGDIPYGKE1SYTCDPHPDR 

GMTFNLIGESTIRCTSDPHGNGVWSSPAPRCE 

LSVRAGHCKTPEQFPFASPTIPINDFEFPVGTS 

LNYECRPGYFGKMFS1SCLENLVWSSVEDNC 

RJIKSCGPPPEPFNGMVHINTDTQFGSTVNYSC 

NEGFRLIGSPSTTCLVSGNNVTWDKKAPICEII 

S CEPPPT1 SNG D F Y SNN RTS FHNGT WTY QCH 

TGPDGEQLFELVGERSIYCTSKDDQVGVWSS 

PPPRC1STNKCTAPEVENAIRVPGNRSFFSLTEI 

IRFRCQPGFVMVGSHTVQCQTNGRWGPKLPH 

CSRVCQPPPErLHGEHTLSHQDNFSPGQEVFY 

SCEPSYDLRGAASLHCTPQGDWSPEAPRCTV 

KSCDDFLGQLPHGRVLLPLNLQLGAK V SFVC 

DEGFRLKGRS ASHCVL AGMKAL WNSS VP VC 

EQIFCPNPPA1LNGRHTGTPLGDIPYGKEVSYT 

CDPHPDRGMTFNLIGEST1RRTSEPHGNGVWS 

SPAPRCELPVGAACPHPPKIQNGHYTGGHVSL 

YLPGMTISYTCDPGYLLVGKGFIFCTDQGIWS 

QLDHYCKEVNCSFPLFMNGISKELEMKKVYH 

YGDYVTLKCEDGYTLEGSPWSQCQADDRWD 

PPLAKCTSRTHD AL I VGTLSGTIFFILLIIFLS Wl 

ILKHRKGNNAHENPKEVA1HLHSQGGSSVHP 

RTLQTNEEN SRVLP 


803 


2153 


A 


6574 


2 


3233 


HGRSARLAAVPAEAMPGPRRPAGSRLRLLLL 

LLLPPLLLLLRGVSHAGNLTVA WLPLANTS Y 

PWSWA\RVGPAVELALAQVKARPDLLPGWT 

VRTVLGSSENALGVCSDTAAPLAAVDLKWE 

HNPAVFLGPGCVYAAAPVGRFTAHWRVPLL 

TAGAPALGFGVKDEYALTTRAGPSYAKLGDF 

VAALHRRLGWERQALMLYAYRPGDEEHCFF 

LVEGLFMRVRDRLNITVDHLEFAEDDLSHYT 

RLLRTMPRKGRVIYICSSPDAFRTLMLLALEA 

GLCG ED YVFFHLDLFGQSLQGGQGP APRRP W 

ERGDGQDVSARQAFQAAKIITYKDPDNPEYL 

EFLKQLKHLAYEQFWTMEDGLVNTTPASFH [ 
DGLLLYIQAVTETLAHGGTVTDGENITQRMW 

NRSFQG VTG YLKID S SGDRETDFSL WDMDPE 

NGAFRWLNYNGTSQELVAVSGRKLNWPLG 

YPPPDLPKCGFDNEDPACNQDHLSTLEVLALV 

GSLSLLGELIVSFFIYRKMQLEKELASELWRVR 

WEDVEPSSLERHLRSAGSRLTLSGRGSNYGSL 

LTTEGQFQVFAKTAYYKGNLVAVKRVNRKR 

IELTRKVLFELKHMRDVQNEHLTRFVGACTD 

PPNICtLTEYCPRGSLQDILENESITLDWMFRY 

SLTNDIVKGMLFLHNGAICSHGNLKSSNCVV 

DGRFVLKITDYGLESFRDLDPEQGHTVYAKK 

LWTAPELLRMASPPVRGSQAGDVYSFGIILQE 

lALRSGVFHVEGLDLSPKEUERVTRGEQPPFR 

PSLALQSHLEELGLLMQRCWAEDPQERPPFQ 

QIRLTLRKFNRENSSNILDNLLSRMEQYANNL 

EELVEERTQAYLEEKRKAEALLYQILPHSVAE 

QLKRGETVQAEAFDSVTIYFSDIVGFTALSAE 

STPMQ VVTLLNDL YTCFDA VIDNFD VYK VET 

IGDAYMVVSGLPVRNGRLHACEVARMALAL 

LDAVRSFRTRHRPQEQLRLRIGIHTGPVCAGV 

VGLKMPRYCLFGDTVNTASRMESNGEALVKI 

HLSS\ETKAVL\EEFGGFELELRGDVEMKGKG 

KVRTY WLLGERG SSTRG 


804 


2154 


A 


6585 


2 


3837 


DAPGRPPVRLPTMELEDGVVYQEEPGGSGAV 

MSERVSGLAGSIYREFERLIVRYDEEVVKELEP 

LVVAVLENLDSVFAQDQEHQVELELLRDDNE 

QLITQYEREKALRKHAEEKFIEFEDSQEQEKK 

DLQTRVESLESQTRQLELKAKNYADQISILEE 

REAELKKEYNALHQRHTEMIHNYMEHLERT 

KLHQLSGSDQLESTAHSRIRKERP1SLGIFPLP 

AGDGLLTPDAQKGGETPGSEQWKFQELSQPR 

SHTSLKDELSDVSQGGSKATTPASTANSDVA 

TI PTDTPLKEENE GF VX VTD APNK S EJSKH IE V 

QVAQETRNVSTGSAENEEKSEVQAIIESTPEL 

DMDKDLSGYKGSSTTTKGIENKAFDRNTESL 

FEELSSAGSGUGDVDEGADLLGMGREVENLI 

LENTQLLETKNALNIVKNDLIAKVDELTCEK 

DVLQGELEAVKQAKLKLEEKNRELEEELRKA 

RAEAEDARQKAKDDDDSDIPTAQRKRFTRVE 

MARVLMERNQYKERLMELQEAVRWTEMIR 

ASRENPAMQEKKRSSIWQFFSRLFSSSSNTTK 

KPEPPV^KYNAPTSHVTPSVKJCRSSTLSQLP 

GDKSKAFDFLSEETEASLASRREQKREQYRQ 

VKAHVQKEDGRVQAFGWSLPQKYKQVTNG 

QGENKMKNLPVTVYLRPLDEKDTSMKLWCA 

VGVNLSGOKTRDGGSWGASVFYKDVAGLD 

TEGSKQRSASQSSLDKLDQELKEQQKELKNQ 

EELSSLVWICTSTHSATKVL1IDAVQPGNILDS 

FTVCNSHVLCIASVPGARETDYPAGEDLSESG 

QVDKASLCGSMTSNSSAETDSLLGGITVVGC 

SAEGVTGAATSPSTNGASPVMDKPPEMEAEN 

SEVDENVPTAEFAATEATEGNAGSAEDTWDIS 

QTG VYTEHVFTDPLGWQIPEDL SPVYQS SND 

SDAYKDQISVLPNEQDLVREEAQKMSSLLPT 

MWLGAQNGCLYVHSSVAQWRKCLHSIKLKD 

Sn,SIVHVKGrVLVALADGTLAIFHRGVDGQW 

DLSNYI ELLDLGRPI II ISIRCMTVVHDKVWCG 

YRNKIYVVQPKAMKIEKSFDAHPRKESQVRQ 

LAWVGDGVWV S I RLDSTLRL YHA HTYQHLQ 

DVDIEPYVSKMLGTGKLGFSFVRITALMVSC 

NRLWVGTGNGVnSIPLTETVTLHQGRLLGLR 
ANKTSGVPGNRPGSVIRVYGDENSDKVTPGT 
FIPYCSMAHAQLCFHGHRDAVKFTVAVPGQV 
1SPQSSSSGTDLTGDKGRGHLHRSLWRRP 


805 


2155 


A 


6605 


469 


2602 


FGRLL WGT AFK SWKMKAPrPHLILL Y ATFTQ 

SLKVVTKRGSADGCTDWSIDIKKYQVLVGEP 

VRIKCALFYGYIRTNYSLAQSAGLSLMWYKS 

SGPGDFEEPIAFDG SRMSKEEDSIWFRPTLLQ 

DSGLYACVIRNSTYCMKVSISLTVGENDTGL 

C YNS KMK YFEKAELSKSKEI S CRDIEDFLLPT 

REPEILWYK^CRTKTWRPSIVFKRDTLLIREV 

REDDIGNYTCELKYGGFWRRTTELTVTAPL 

TDKPPKLLYPMESKLTIQETQLGDSANLTCRA 

FTGYSGDVSPLIYWMKGEKFIEDLDENRVWE 

SDIMCILKEHLGEQEVSISLIYDSVEEGDLGNYS 

CYVENGNGRRHASVLLHKRELMYTVELAGG 

1 a A IT I I 1 VP! VTI YK C YK TFI\4T FYRNT-TFtlA 

EELDGDNKDYDAYLSYTKVDPDQWNQETGE 

EERFALEILPDMLEKHYGYKLFIPDRDLIPTGT 

YIEDVARCVDQSKRIJIVMTPNYVVRRGWSIF 

ELETRLRr^MLVTGEIKVILIECSELRGIMNYQE 

VEALKHTIKLLTVIKWHGPKCNFa.NSKFWKR 

LQYEMPFKRIEPrTHEQALDVSEQGPFGELQT 

VSAISMAAATSTALATAHPDLRSTFHNTYHS 

QMRQKHYYRSYEYDVPPTGTLPLTSIGNQHT 

YCNIPNTTLINGQRPQTKSSREQNPDEAHTNSA 

ILPLLPRETSISSVIW 


806 


2156 


A 


6614 


3 


1584 


NSARGGVGVRGARAMATVQEKAAALNLSAI , 

HSP AHRPPGFS VAQKPFG ATYV WS S I INTLQT 

Q VEVKKRRHRLKRHNDCFVG SEA VD VIFSHL 

1QNKYFGDVDIPRAKVVRVCQALMDYKVFE 

AVPTKVFGKDKJ^TFEDSSCSLYRFTTIPNQD 

SQLGKENKLYSPARYADALFKSSDIRSASLED 

I WFNT ST KPANSPHVNISTTLSPOVTNEVWOE 

ETIGRLLQLVDLPLLDSLLKQQEAVPKIPQPK 

RQSTMVNSSNYLDRGILKAYSDSQEDEWLSA 

AIDCLEYLPDQMWEISRSFPEQPDRTDLVKE 

LLFDAJGRYYSSREPLLNHLSDVHNGIAELLV 

NGKTEIALEATQLLLKLLDFQNREEFRRLLYF 

MAVAANPSEFKLQKESDNRMVVKRIFSKArV 

DNKNLSKGKTDLLVLFUMDHQKDVFKIPGT 

LVHKIVSWKILMAIQNGRDPNRDAGYIYCQRJ 

DQRDYSNITEKTTIDELLYLLKTLDEDSKLSA 

KEKKKVLLGQFYKCHPDIFIEI IFGD 


807 


2157 


A 


6615 


4198 


2094 


FGIVGTFALETDELDSDRDPAIFSLCDFGAMR 

PQILL1XALLTLGLAAQHQDK.VPCKM/VKML 

CPDRVDKKVSCQVLGLLQVPSVLPPDTETLD 

LSGNQLRSILASPLGFYTALRHLDLSTNEISFL 

QPGAFQALTHLEHLSLAHNRLAMATALSAG 

GLGPLPRVTSLDLSGNSLYSGLLERLLGEAPS 

LHTLSLAENSLTRLTRHTFRDMPALEQLDLHS 

NVLMDIEDGAFEGLPRLTHLNLSRNSLTCTSD 

FSLQQLRVLDLSCNSIEAFQTAS\QPQAEFQLT 

WL D L RENKLLHFP DL AALPRL IYLNLSNNL IR 

LPTGPPQDSKGIHAPSEGWSALPLSVAPSGNAS 

GRPLSQLLNLDLSYNEIELIPDSFLEHLTSLCFL 

NLSRNCLRTFEARRLGSLPCLMLLDLSHNALE 

TLELGARALGVSLRTLLLQGNALRDLPPYTFA 

NLASLQRLNLQGNRVSPCGGPDEPGP^5GCV\ 

AFSGITSLRSLSLVDNEIELLRAGAFLHTPLTE 
LDLSSNPGLEVATGALGGLEASLEVLALQGN 

GLMVLQVDLPCFICLICRLNLAENRLSHLPAW 

TQ AV SLE VL DLRNN SFSLLPGS AMGGLETSLR 

RLYLQGNPLSCCGNGWLAAQLHQGRVDVDA 

TQDLICRFSSQEEVSLSHVRPEDCEKGGLKNI 

NLniLTFlLVSAJIXTTLAACCCVRRQKFNQQ 

YKA 


808 


2158 


A 


6619 


153 


1852 


FKALSQYIYTNTHLEREAAFEVAILLRRMEEG 

ARHRNNTEKXHPGGGESDASPEAGSGGGGV 

ALKKEIGLVSACGIIVGNIIGSGIFVSPKGVLEN 

AGSVGLALIVWIVTGFITWGAJLCYAELGVNI 

PKSGGDYFYVKDIFGGLAGFLRLWIAVLVTYP 

TNQAVIALTFSNYVLQPLFPTCFPPESGLRLLA 

AICLLLLTWVNCSSVRWATRVQDIFTAGKLL 

ALALIIIMGI VQICK GE YFWIJEPKNAFENFQEP 

DIGLVALAFLQGSFAYGGWNFLNYWTEELV 

DPXYKNLXPRAinSIPNLVTFVYVFANV/ALYVT 

AMSPQEL\LAS\NAVAVTFGEKLLGVMAW1M 

PISVALSTFGGVNGSLFTSSRLFFAGAREGHLP 

SVLAMIHVKRCTPIPALLFTCISTLLMLVTSD 

MYTLINYVGFINYLFYGVTVAGQIVLRWKKP 

DIPRPEKINLLFPIIYLLFWAFLLVFSLWSEPW 

CGIGLAIMLTGVPVYFLGVYWQHKPKCFSDFI 

ELLTLVSQKMCVVVYPEVERGSGTEF.ANED 

MEEQQQPMYQPTPTKDKDVAGQPQP 


809 


2159 


A 


6621 


1041 


223 


QDSRKMLPSTSVNSLVQGNGVLNSRDAARH 

TAGAKRYKYLRRLFRFRQMDFEFAAWQMLY 

LFTSPQRVYRNFHYRK.QTKJDQWARDDPAFL 

VL LSIWLCVSTIGFGF VLDMGFFETIKLLL WV 

VLIDCVGVGLLIATLMWFISNKYLVKRQSRD 

YDVEWGYAFDVHLNAFYPLLVILHFIQLFFEN 

HVILTDTFIG YL VGNTL WLVA VG YYIYVTFL 

GYSVGLLFFSWLPFLKNTVILLYPFAPLILLYG 

LSLALGWNFTHTLCSFYKYRVK 


810 


2160 


A 


6623 


160 


822 


SPASGHCRLNGAAVAMFGCLVAGRLVQTAA 
QQVAEDKFVFDLPDYESrNHVVVFMLGTTPFP 
EGMGGSVYFSYPDSNGMPVWQLLGFVTNGK 
PSAIFKJSGLKSGEGSQHPFGAMNIVRTPSVAQ 
IGISVELLDSMAQQTPVGNAAVSSVDSFTQFT 
QKMLDNFYNFASSFAVSQ/VPDDTQ/RPSEMF 
IPANVVLKWYENFQRRTSTEPSLLENIIWIKIN 
F 


811 


2161 


A 


6627 


18 


3367 


LEG SLHTERAKY YLTrTMPHFTVTKVEDPEEG 

AAASISQEPSLADDCARIQDSDEPDLSQNSITG 

EHSQLLDDGHKKARNAYLNNSNYEEGDEYF 

DKNXALFEEEMDTRPKVSSLLNRMANYTNLT 

QGAKEHEFAFNTTEGKXKPTKTPQMGTFMG 

VYLPCLQNTFGVILFLRLTWVVGTAGVLQAF 

ATVLICCCCTMLTAISMSAIATNGWPAGGSY 

FMISRALGPEFGGAVGLCFYLGTTFAAAMYTL 

GAIEIFLVY1VPRAA1FHSDDALKESAAMLNN 

MRVYGTAFLVLMVLVVFIGVRYVNKFASLFL 

ACV I V SILAI Y AG ADCSSF APPHFPVCMLGNRT 

LSSRHroVCSKTKEINNMTVPSKLWGFFCNSS 

QFFNATCDEYFVHNNVTSIQGIPGLASGirrEN 

LWSNYLPKGEIIEKPSAKSSDVLGSLNHEYVL 

VDrTTSFTLLVGrFFPSVTGIMAGSNRSGDLKD 

AQKSIPIGTILAILTTSFVYLSNVVLFGACIEGV 

VLRDKFGDAVKGNLWGTLSWPSPWVIVIGS 

FFSTCGAGLQSLTGAPRLLQAIAKDNIIPFLRV | 
FGHSKANGEPTWALLLTAAIAELGILIASLDL 

VAPILSMFF1 -MCYI -FVNL AC ALQTLLRTPNW 

RPRFRYYHWALSFMGMSICLALMFISSWYYA 

IVAMVIAGMIYKYIEYQGAEKEWGDGIRGLS 

LSAARTALLRLEEGPPHTKN'WRPQLLVLLKL 

DEDLHVKHPRLLTFASQLKAGKGLTIVGSVIV 

GNFLENYGEALAAEQTIKHLMEAEICVXGFCQ 

LVVAAKLREG1SHLIQSCGLGGMKHNTVVM 

GWPNGWRQSEDARAWKTFIGTVRVTTAAHL 

ALLVAKNISFFPSNVEQFSEGNIDVWWIVHDG 

GMU^PFLLK\QHKVWRKCSIRFF\TVAQLE 

DNSIQMKKDL ATFL YHLRIEAEVEV VEMHD S 

D1S A YTYERTLMMEQR S QMLRHMRLSKTER 

DREAQLVKDRNSMLRLTSIGSDEDEETETYQ 

EKVHMTWTKJDKYMASRGQKAKSMEGFQDL 

LNMRPDQSNVRRMHTAVKLNEVIVNKSHEA 

KLVLLNMPGPPRNPEGDENYMEFLEVLTEGL 

ERVLLVRGGGSEVITIYS 


812 


2162 


A 


6628 


66 


640 


AVCTMSEMAELSELYEESSDLQMDVMPGEG 

DLPQMEVGSGSRELSLRPSRSGAQQLEEEGP 

MEEEEAQPMAAPEGKRSLANGPNAGEQPGQ 

VAGADFESEDEGEEFDDWEDDYDYPEEEQLS 

GAGYRVSAALEEADKMFLRTREPALDGGFQ 

MHYEKTPFDQLAFIEELFNSLMVVNRLTEELG 

CDEIIDRE 


813 


2163 


A 


6630 


708 


1355 


AKMGAYKY1QELWRKKQSDVMRFLLRVRC 

WQYRQLSALHRAPRPTRPDKARRLGYKAKQ 

GY/VYIY1GFVFAVIYRIRVRRGGRKRPVPKG 

ATYGKPVHHGVNQLKFARSLQSVAEERAGR 

HCGALRVLNSYWVGEDSTYKFFEVILIDPFHK 

AIRRNPDTQ WITKP VHKHREMRGLTSAGRKS 

RGLGKGHKFHHT1GGSRRAAWRRRNTLQLH 

RYR 


814 


2164 


A 


6635 


201 


1705 


KGTEMNKSRWQSRRRHGRRSHQQNPWFRLR 

DSEDRSDSRAAQPAHDSGHGDDESPSTSSGT 

AGTSSVPELPGFYFDPEKKRYFRLLPGHNNCN 

PLTKESIRQKJEMESKRLR1XQEEDRRKKXARM 

GFNASSMLRKSQLGFLNVTNYCHLAHELRLS 

CMERKKVQ1RSMDPSALASDRFNLILADTNS 

DRLFTVNDVTVGGSKYGIINLQSLKTPTLKVF 

MHENLYFTNRKVXNSVCWASLNHLDSHILLC 

LMGLAETPGCATLLPASLFVNSHPAGIDRPGN 

MLCSFRIPGAWSCAWSLNIQANNCFSTGLSR 

RVLLTN WTGHRQSFGTNS D VL AQQFALMA 

PLLFNGCRSGEIFAIDLRCGNQGKGWKATRLF 

HDSAVTSVRTLQDEQYLMASDMAGKIKLWD 

LRTTXCVRQYEGHVNEYAYLPLHVHEEEGn. 

VAVGQDCYTRJWSLHDARLLRTTPSPYPASKA 

DIPSVAFSSRLGGSRGAPGLLMAVGQDLYCY 

SYS 


815 


2165 


A 


6643 


659 


3282 


NKNILEVPSARTTRIMGDHLDLLLGWLMAG 

PVFGI PS CSFDGRI AFYRFCNLTQ VPQ VLNTTE 

R1XLSFNYIRTVTASSFPFLEQLQLLELGSQYT 

PLTIDKEAFRNLPNLRILDLGSSKTYFLHPDAF 

QGLFHLFELRLYFCGLSDAVLKDGYFRNLKA 

LTRLDLSKNQIRSLYLHPSFGKLNSLKSIDFSS 

NQIFLVCEHELEPLOGKTLSFFSLAANSLYSR 

VSVDWGKCMNPFRNMVLEILDVSGNGWTV 

DITGNFSNAISKSQAFSLrLAHHrMGAGFGFHN 

IKDPDQNTFAGLARSSVRHLDLSHGFVFSLNS 
RVFETLKDLKVLNLAYNKJNKIADEAPYGLD 

NLQVLNLSYNLLGELYSSNFYGLPKVAYIDL 

QKNHIAirQDQTFKFLEKLQTLDLRDNALrriH 

FIPSIPDIFLSGNKLVTLPKINLTANLIHLSENR 

LENLDILYFLLRVPHLQIl.n.NQNRFSSCSGDQ 

TPSENPSLEQLFLGENMLQLAWETELCWDVF 

EGLSHLQVLYLNHNYLNSLPPGVFSHLTALR 

GLSLNSNRLTVLSHNDLPANLEILDISRNOLL 

APNPD VTVSLS VLDITHNKFICECELSTFIN WL 

NHTNVTIAGPPADIYCVYPDSLSGVSLFSLSTE 

GCDEEEVLKSLKFSLnVCTVTLTLFLMTILTV 

TKFRGFCFICYKTAQRLVFKDHPQGTEPDMY 

KYDAYLCFSSKDFTWVQNALLKHLDTQYSD 

QNRFNLCFEERDFVPGENRPVAN1QDAIWNSR 

KJVCLVSRHFLRDGWCLEAFSYAQGRCLSDL 

NSALIMVVVGSLSQYQLMKHQSIRGFVQKQQ 

YLRWPEDLQDVGWFLHKLSQQILKKEKEKK 

KDNNIPLQTVATIS 


816 


2166 


A 


6646 


1 


3811 


RDRAGVRPAGK.QHAAAAFYDVGGDRPWDS 

GNTQLPPRNPVKANAMFGAGDEDDTDFLSPS 

GGARLASLFGLDQAAAGHGNEFFQYTAPKQP 

KKGQGTAATGNQ ATPKT APATM STPTIL V AT 

AVHAYRYTNGQYVKQGKFGAAVLGNHTTR 

EYRILLYISQQQPVTVARIHVNFELMVRPNNY 

STFYDDQRQNWSIMFESEKAAVEFNKQVCIA 

KCNSTSSLDAVLSQDLIVADGPAVEVGDSLE 

VAYTGWLFQNHVLGQVFDSTANKDKLLRLK 

LGSGK.VIKGWEDGMLGMKKGGKRLLIVPPA 

CAVGSEGVIGWTQATDSILVFEVEVRRVKIA 

KDSGSDGHSVS SRDS AAPSPIPG ADNLS ADP V 

VSPPTSIPFKSGEPALRTKSNSLSEQLAINTSPD 

A VKAKL] SRMAKMGQPMLPILPPQLDSNDSEJ 

EDVNTLQGGGQPWTPSVQPSLQPAHPALPQ 

MTSQAPQPSVTGLQAPSAALMQVSSLDSHSA 

VSGNAQSFQPYAGMQAYAYPQASAVTSQLQ 

PVRPLYPAPLSQPPHFQGSGDMASFLMTEAR 

QHNTEIRMAVSKVADKMDHLMTKVEELQKH 

SAGNSMLIPSMSVTMETSMIMSNIQRIIQENER 

LKQEELEKSNRIEEQNDKISELIERNQRYVEQS 

NLMMEKRNNSLQTATENTQARVLHAEQEKA 

KVTEELAAATAQVSHLQLKMTAHQKKETEL 

QMQLTESLXETDLLRGQLTKVQAKLSELQET 

SEQAQSKFKSEKQNRKQLELKVTSLEEELTDL 

RVEKESLEKNLSERKKKSAQERSQAEEEIDEI 

RKSYQEELDKLRQLLKKTRVSTDQAAAEQLS 

LVQAELQTQWEAXCEHLLASAKDEHLQQYQ 

EVCA QRDA YQQKL VQT ,QEKS VCFA\CL ALQ A 

QITALTKQNEQHIKELEKNKSQMSGVEAAAS 

DPSEKVKKIMNQ VFQ SLRREFELEES YNGRTI 

LGTTMNTTKMVTLQLLNQQEQEKEESSSEEEE 

EXAEERPRRPSQEQSASASSGQPQAPLNRERP 

ESPMVPSEQWEEAVPLPPQALTTSQDGHRR 

KGDSEAEALSEIKDGSLPPFXSCIPSHRVLGPP 

TSIPPEPLGPVSMDSECEESLAASPMAAKVPDN 

PSGK\VCVREVAPDGPLQESSTRLSLTS\DPEE 

GDPLALGPESPGEPQPPQLKKDDVTSSTGPHK 

ELSSTEAGSTVAGAALRPSHHSQRSSLSGDEE 

DELFKGATLKALRPKAQPEEEDEDEVSMKGR 

PPPTPLFGDDDDDDDroWLG 


817 


2167 


A 


6649 


63 


1073 


FFRSSSDNGSPIRQYE/HSTPAHQGPVMGLEG 






WO 01/57188 



PCT/US01/03800 



KS/ARNSQLR1 VL VGKTG AGKS ATGN SILGRK 

VFHSG 1 AAKSITKKCEKRSSSWKETEL WVD 

TPGIFDTEVPNAETSKEIIRCILLTSPGPHALLL 

V\TLGRYTEEEHKATEKILKMFGERARSFMIL 

IFTRKDDLGDTNLHDYLRJEAPEDIQDLMDIFG 

DRYCALNNKATGAEQEAQRAQLLGLIQRW 

RENKEGCYTNRMYQRAEEEIQKQTQAMQEL 

1 IRVELEREKARIREE YEEKIRKLEDK VEQEKR 

KKQMEKKLAEQEAHYAVRQQRARTEVESKD 

GILELIMTALQIASF1LLRLFAED 


818


2168 


A 


6660 


357 


1890 


APSGSWTRVVLTLDPCSLRSRSPRSLLDPGMP 

GISARGLSHEGRKQLAVNLTRVLALYRSILDA 

Y1IEFPTDNLWDTLPCSWQEALDGLKPPQLA 

TMLLGMPGEGEVVRYRSVWPLTLLALKSTA 

CALAFTRMPGFQTPSEFLEKPSQSSRLTAPFR 

KHVRPKKQHE1RRLGELVKKLSDFT/GLHPGC 

RRGLRPGM ILSRFMALGLGLMVKSIEGDQRL 

VERAQRLDQELLQALEKEEKRNPQVVQTSPR 

HSPHHVVRWVDPTALCEELLLPLENPCQGRA 

RLLLTGLHACGVDLSVALLRHFSCCPEVVALA 

SVGCCYMKLSDPGGYPLSQWVAGLPGYELP 

YRLREGACHALEEYAERLQKAGPGLRTHCY 

RAALETVIRRARPELRRPGVQGIPRVHELKIEE 

YVQRGLQRVGLDPQLPLNLAALQAHLAQEN 

RVVAFFSLALLLAPLVETLILLDRLLYLQEQA 

LSP\GFHAELLPIFSPELSPRNLVLVATKMPLG 

QALSVLETEDS 


819 


2169 


A 


6661 


65 


2686 


SGSGHCLAEAASMGPWGWKLRWTVALLLA 

AAGTAVGDRCERNEFQCQDGKCISYKWVCD 

GSAECQDGSDESQETCLSVTCKSGDFSCGGR 

VNRCIPQFWRCDGQVDCDNGSDEQGCPPKTC 

SQDEFRCHDGKCISRQF VCDSDRDCLDG SDE 

ASCPVLTCGPASFQCNSSTCIPQLWACDNDPD 

CEDGSDEWPQRCRGLYVFQGDSSPCSAFEFH 

CLSGECIHSSWRCDGGPDCKDKSDEENCAVA 

TCRPDEFQCSrXjNCIHGSRQCDREYDCKDMS 

DE V GC VNVTLCEGPNKFKCHSGECITLDKVC 

NMARDCRDWSDEPDCECGTNECLDNNGGCS 

IIVCNDLKIGYECLCPDGFQLVAQRRCEDIDE 

CQDPDTCSQLCVNLEGGYKCQCEEGFQLDPH 

TKACKAVGSIAYLFFTNRHE\TUCMTLDRSEY 

TSLIPNLRNVVALDTEVASNRTYWSDLSQRMI 

CSTQLDRAHGVSSYDTVISRDIQAPDGLAVD 

WIHSNIYWTDSVLGTVSVADTKGVKRKTLFR 

ENGSKPRATVVDPVHGFMYWTDWGTPAKIK 

KGGLNGVDIYSLVTENIQWPNGITLDLLSGRL 

YWVDSKLHSISSIDVNGGNRKTILEDEKRJLAH 

PFSLAVFEDKVFWTDITNEAIFSANRLTGSDV 

NLLAENLLSPFJDMVLFHNLTQPRGVNWCERT 

TLSNGGCQYLCLPAPQINPHSPKFTCACPDGM 

LLAR\DMRSCLTEG\EAAVATQETSTVRLKVS 

STAVRTQHTTTRPVPDTSRLPGATPGLTTVEI 

VTMSHQALGDVAG\RGN\EKKPSSVRALSrVL 

PrV\LLVTlXLGVFLLWKNWRLKNINSrNFDNP 

VYQKTTEDEVfflCHNQDGYSYPSRQMVSLED 

DVA 


820 


2170 


A 


6666 


17 


4146 


ERGISSQIKGMKSGSGGGSPTSLWGLLFLSAA 
LSLWPTSGEICGPGIDIRNDYQQLKRLENCTVI 
EGYLHILLISKAEDYRSYRFPKLTVITEYLIXF 
RVAGLESLGDLFPNLTVIRGWKLFYNYALVIF 
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82] 












EMTNLKDIGLYNLRNITRG\AIRIEKNADLCYL 

STVDWSLILDAVSNNYIVGNKPPKECGDLCP 

GTMEE KPMC EKTT INNEYNYRC WTTNRCQK 

MCPSTCGKRACTENNECCHPECLGSCSAPDN 

DTACVACRHYYYAGVCVPACPPNTYRFEGW 

RC VDRDFCAN I L S AES SDSEGF VIHDGECMQE 

CPSGFIRNGSQSMYCIPCEGPCPKVCEEEKKT 

KTIDSVTSAQMLQGCTIFK.GNLLINIRRGNNIA 

SELENFMGLIEWTGYVKIRHSHALVSLSFLK 

NLRLILGEEQLEGNYSFYVLDNQhTLQQLWD 

WDHRNLT1KAGKMYFAFNPKLCVSEIYRMEE 

VTGTKGRQSKGDINTRNNGERASCESDVLHF 

TSTTTSKNRJUITWHRYRPPDYRDLISFTVYYK 

EAPFKN VTE YDG QDACGSNS WNNTVD VDLPP 

NKDVEPGILLHGLKPWTQYAVYVKAVTLTM 

VENDHIRGAKSEILYIRTNASVPSIPLDVLSAS 

NSSSQLI VK WNPPS LPNGNL S YYrVRWQRQP 

QDGYLYRHNYCSKDKIPIRKYADGTIDIEEVr 

ENPKTEVCGGEKGPCCACPKTEAEKQAEKEE 

AEYRK VFENFLHN SIFV PRPERKRRD VMQ V A 

NTTMSS R SRNTT AADTYNITDPEELETE YPFF 

ESRVTiNKERTVISNLRPFTLYRIDIHSCNHEAE 

KLGCSASNFVFARTMPAEGADDIPGPVTWEP 

RPENSIFLKWPEPENPNGLILMYEIKYGSQVE 

DQRECVSRQEYRKYGGAKLNRLNPGNYTARI 

QATSLSGNGSWTDPVFFYVQAKRYENFIHLII 

ALPVAVLLIVGGLVTMLYVFHRKRNNSRLGN 

G VL Y AS VNPEYF SAADVYVPDE WEV AREKIT 

MSRELGQGSFGMVYEGVAKGVVXDEPETRV 

AIKTVTsfEAASMRERIEFLNEASVMKEFNCHH 

WRLLGWSQGQPTLVIMELMTRGDLKSYLR 

SLRPEMENNPVLAPPSLSKMIQMAGEIADGM 

AYLNANKFVHRDLAARNCMVAEDFTVKIGD 

FGMTRDIYETDYYRKGGKGLLPVRWMSPESL 

KDGVFTTYSDVWSFGVVLWEIATLAEQPYQ 

GLSNEQVLRFWMEGGLLDKPDNCPDMLFEL 

MRMCWQYNPKMRPSFLEirSSIKEEMEPGFRE 

VSFYYSEENKLPEPEELDLEPENMESVPLDPS 

AS SSSLPLPDRHS GHKAENGPGPG VLVLRASF 

DERQPYAHMNGGRKNERALPLPQSSTC 


2171 


A 


6691 


106 


825 


GRVLFRGCGVGHKGQVLMGTFILAQDWLSE 

SNHVFCVSSMLRLQKRLASSVLRCGKKKVAV 

LDPNETNEIANANSRQQIRK-LIKDGLIIRKPVT 

VH SRARCRKNTL ARRKGRHMGIGKRKGT AN 

ARMPEKVTWMRRMRILRRLLRRYRES7KRYR 

ESKXIDRHMYHSLYLKVXGNVTKNKRILMEH 

IHKJjKADKARKKLLADQAEARRSKTKEARK 

RREERLQAKKEEUKTLSKEEETICK 


822 


2172 


A 


6715 


772 


21 


DFKPGLLLPRKJCKMFGFHKPKMYRSIEGC\CI 

SGAKSSSS\RFTDSKRYEK>DFQ\SCFGLHEm\ 

SGDI\CNA\CVLL\LKRWKKLPAGSKK\NWNH 

WDARAGPS\LKTTLKPKKVKTL\SGNRIK\ST 

QISKLQKBFKR\HNSDAHSTTS\SASP\AQSPLF 

TVNQFRWTGSDTGVGFPGSNRNHPVFSFLDLV 

TYWKRQKJCCGIXIYKGRFGEVLIDTHLFKPCC 

SNKKA\AAEKPEEQGPEPLPISTQEWVTEVFM 


823 


2173 


A 


6727 


3 


4063 


PYLATLQLDSSLLrPPKYQTPPAAAQGQATPG 
NAGPLAPNGSAAPPAGSAFNPTSNSSSTNPAA 
SSSASGSSVPPVSSSASAPGISQISTTSSSGFSGS 
VGGQNPSTGGISADRTQGNIGCGGDTDPGQS 
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SSQPSQDGQESNVPSVGSLADPDYLNTPQMK 

TPVTLNSAAPASNSGAGVLPSPATPRFSVPTP 

RTTRTPRTPRGGGTASGQGSVKYDSTDQGSP 

ASTPSTTRPLNSVEPATMQPIPEAHSLYVTLIL 

SDSVMNTFKDRNFDSCC1CACNMNIKGADVG 

LYEPDSSNEDQYRCTCGFSAIMNRKLGYNSGL 

FLEDELDIFGKNSDIGQAAERRLM\MCQSTFL 

PQVEGTKKPQEPP1SLLLLLQNQHTQPFASLN 

FLDYISSNNRQTLPCVSWSYDRVQADNNDY 

WTECFNALEQGRQYVDNPTGGKVDEALVRS 

ATVHSWPHSNVLDISMLSSQDWRMLLSLQP 

FLQDAIQKKRTGRTWENIQHVQGPLTWQQFH 

KMAGRGTYGSEESPEPLPIPTLLVGYDKDFLT 

ISPFSLPFWERLLLDPYGGHRDVAYTWCPEN 

EALLEGAKTFFRDLSAVYEMCRLGQHKPICK 

VLRDGLMRVGKTVAQKLTDELVSEWFNQPW 

SGEENDNHSRLKLYAQVCRHHLAPYLATLQL 

DSSLLEPPKYQTPPAAAQGQATPGNAGPLAPN 

GSAAPPAGSAFNPTSNSSSTNPAASSSASGSSV 

PPVSSSASAPG1SQISTTSSSGFSGSVGGQNPST 

GGISADRTQGNIGCGGDTDPGQSSSQPSQDG 

QESVTERERIGIPTEPDSADSHAHPPAVVrYM 

VDPFTYAAEEDSTSGNFWLLSLMRCYTEMLD 

NLPEHMRNSFILQIVPCQYMLQTMKDEQVFY 

IQYLKSMAFSVYCQCRRPLPTQIHIKSLTGFGP 

AA S I EMTL KNP ERPSPIQLYSPPFI LAP IKDKQT 

ELGETFG E A SQK YNVLF VG YCLSHDQRWLL 

ASCTDLHGELLETCVVNIALPNRSRRSKVSAR 

LGHGELKDWSILLGECSLQTISKKLKDVCRM 

CGISAADSPSILSACLVAMEPQGSFWMPDAV 

TMGSVFGRSTALNMQSSQLNTPQDASCTHIL 

VFPTSSTIQ VAPANYPNEDGF SPNNDDMFVDL 

PFPDDMDNDIGILMTGNLHSSPNSSPVPSPGSP 

SGIGVGSHFQHSRSQGERLLSREAPEELKQQP 

LALGYFVSTAKAENLPQWFWSSCPQAQNXQC 

PLFLKASLHHHISVAQTDELLPARNSQRVPF1P 

LDSKTTSD VLRFVLEQ YNALS WLTCNPATQD 

RTSCLPVI IFWLTQL YNAIMNTL 


824 


2174 


A 


6732 


2440 


365 


VEEGLGRRRTPPGGRRGPVTPARPGPDSVRR 

RLLPPS S AAAFS SHRHNLLC SRRRGGGGGGG 

GGGGGTIKRPGITGPTAATSPSGEPGNAASAP 

LSLLSPFPGQTTYQHPGVAEPSAYGGRDVAC 

ASLVFGRLQHRGGDRKRGLLGRSSGDAASD 

QPFRCRSGSTAGRLVKQMDFTEAYADTCSTV 

GLAAREGNVKVLRKLLKKGRSVDVADNRG 

WMPIHEAAYHNSVECLQMLINADSSENYIKM 

KTFEGFCALHLAASQGHWKIVQILLEAGADP 

NATTLEETTPLFLAVENGQIDVLRLLLQHGAN 

VNGSHSMCGWNSLHQASFQENAEIIKLLLRK 

GANKECQDDFGITPLFVAAQYG\KLESIASILIS 

SG\ANVNCQALDKATPLFIAAQEGHTKCVELL 

LS SG ADPDL YCNEDS WQLPIHAAAQMGHTKJ 

LDLLIPLTNRACDTGLNKVSPVYSAVFGGHE 

DCLEILLRNGYSPDAQACLVFGFSSPVCMAFQ 

KDCEFFGIVNILLKYGAQINELHLAYCLKYEK 

F SIFRYFLJIKGC SLGPWNFfl YEFVNHAIKAQ A 

KYKEWLPHLLVAGFDPLILLCNSWIDSVSrDT 

LIFTLEFTNWKTLAPAVERML SARA SNA WIL 

QQHIATVPSLTHLCRLEIRSSLKSERLRSDSYIS 
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QLPLPRSLHNYLLYEDVLRMYEVPELAAJQD 
G 


825 


2175 


A 


6735 


277 


1252 


RIMGLPDRGVQMLLTTVGAPAAFSLMTIAVG 

TDYWLYSRGVCKTKSVSENETSKKNEEVMT 

HSGLWRTCCLEGNFKGLCKQIDHFPEDADYE 

ADTAEYTLRAVRAS S IFP1LS V1LLFMGGLCI A 

ASEFYKTRHNlILSAGIFFVSAGLSNTIGnVYIS 

ANAGDPSKSDSKKNSYSYGWSFYFGALSFIJA 

EMVGVLAVHMFIDRHKQLRATARA\TDYLQ 

ASAJTRIPSYRYRYQRRSRSSSRSTEPSHSRDA 

SPVGDCGFNTLPSTEISMYTLSRDPLKAATTPT 

ATYNSDRDNSFLQVHNCIQKENKDSLHSNTA 

NRRTTPV 


826 


2176 


A 


6744 


3 


5177 


SDDLRTGLFQDVQDAESLKLPGVYEVLFYNE 

TEDCPGMMLWRYPEPRGLTLVRITPVPFNTT 

EDPDISTADLGDVLQDPCSLEYWDELQKVFV 

AFREFNLSESKVCELQLPDINLVNDQKKLVSS 

DLWRTVLNSSQNGADDQSSASESGSQSTCDPL 

V7PTALAACTRVDSCFTPWFVPSLCVSFQFAH 

LEFHLCHHLDQLGTAAPQYLQPFVSDRNMPS 

ELEYMrVSFREPHMYLRQWNNGSVCQEIQFL 

AQADCKLLECRNVTMQSVVKPFSIFGQMAVS 

SDVVEK.LLDCTVIVDSVFVNLGQHWHSLNT 

AIQAWQQNKCPEVEELVFSHFVICNDTQETL 

RFGQ VDTDENILLA SLHSHQYS WRSHKSPQL 

LHICIEGWGNWRWSEPFSVDHAGTFIRTIQYR 

GRTASLIIKVQQLNGVQKQIIICGRQIICSYLSQ 

SIELKWQHYIGQDGQAVVREHFDCLTAKQK 

LPSYILENNELTELCVKAKGDEDWSRDVCLE 

SKAPEYSIVIQVPSSNSSIIYVWCTVLTLEPNS 

QVQQRMIVFSPLFIMRSHLPDPIIIHLEKRSLGL 

SETQHPGKGQEKPLQNIEPDLVHHLTFQAREE 

YDPSDCAVPISTSLIKQIATKVHPGGTVNQrLD 

EFYGPEKSLQPIWPYNKKDSDRNEQLSQWDS 

PMRVKLSIWKPYVRTLLIELLPWALLINESKW 

DLWLFEGEKJVLQVPAGKI1IPPNFQEAFQIGIY 

WANTNTVHKSVAIKLVHNLTSPKWKDGGNG 

EVVTLDEEAFVDTEIRJLGAFPGHQKLCQFC1S 

SMVQQGIQIIQIEDKTTnNNTPYQIFYKPQLSV 

CNPHSGKEYFRVPDSATFS1CPGGEQPAMKSS 

SLPCWDLMPDISQSVLDASLLQKQIMLGFSPA 

PGADSSQCWSLPAIVRPEFPRQSVAVPLGNFR 

ENGFCTRAIVLTYQEHLGVTYLTLSEDPSPRV 

IIHNRCPVKMLIKEN1KD1PKFEVYCKK1PSECS 

MHELYHQISSYPDCKTKDLLPSLLLRVEPLDE 

VTTEWSDAIDINSQGTQVVFLTGFGYVYVDV 

VHQCGTVFTTVAPEGKAGPILTNTNRAPEKJV 

TF/KMFTTQLSLAVFDDLTHHKASAELLRLTL 

DNIFLCVAPGAGPLPGEEPVAALFELYCVEIC 

CGDLQLDNQLYNKSNFHFAVLVCQGEKAEPI 

QCSKMQSLLISNKELEEYKEKCFIKLCITLNEG 

KSILCDINEFSFELKPARLYVEDTFVYYIKTLF 

DTYLPNSRLAGHSTHLSGGKQVLPMQVTQH 

ARALVNPVKLRKLVIQPVNLLVSIHASLKLY1 

ASDHTPLSFSVFERGPIFTTARQLVHALAMHY 

AAGALFRAGWWGSLDILGSPASLVRStGNG 

VADFFRLPYEGLTRGPGAFVSGVSRGTTSFVK 

HIS KGTLTS ITN LAT S LARNM D RLSL DEEFTYN 

RQEEWRRQLPESLGEGLRQGLSRLGISLLGAI 

AG1VDQPMQNFQKTSEAQASAGHKAKGVISG 






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VGKGIMOVFTKPIGGAAELVSQTGYGILHGA 

GLSQLPKQRHQPSDWHADQAPNSHVKYVW 

KMLQSLGRPEVHMALDWLVRGSGQEHEGC 

LLLTSEVLFWSVSEDTQQQAFPVTErDCAQD 

SKQNNLLTVQLKQPRVACDVEVDGVRERLSE 

QQYNRLVDYITKTSCHLAPSCSSMQIPCPWA 

AEPPPSTVXTYHYLVDPHFAQVFLSKFTMVK 

NKALRKGFP 




827


2177


A 


6748 


2 


1662 


FVGAPRRGNPFGSPGNPGRHQGPCHRPRGTK 

A SG V SPTLWRPQ AAATGLEMPS SGRALLDSP 

LDSGSLTSLDSSVFCSEGEGEPLALODCFTVN 

VGGSRFVLSQQALSCFPHTRLGKLAVVVASY 

RRPGALAAVPSPLELCDDANPVDNEYFFDRS 

SQAFRYVLHYYRTGRLHVMEQLCALSFLQEI 

QYWGIDELSIDSCCRDRYFRRKELSETLDFKK 

DTEDQESQHESEQDFSQGPCPTVRQKLWNIL 

EKPGSSTAARIFGVISirFVGVSirNMALMSAEL 

SWLDLQLLEILEYVCISWFTGEFVLRFLCVRD 

R CRFLRK VPNTTDL LA U .PFY1TLLVESLSG\SQT 

TQEL\ENVGAHCPGCLRLLRAL\RMLKAWGR 

HSTGLRSLGMTITQCYEE VGLLLLFLS VGI SIF 

STVEYFAEQSIPDTTFTSVPCAWWWATTSMT 

TVGYGD1RPDTTTGKTVAFMCILSG1LVLALP1 

A1INDRFSACYPTLKLKEAAVRQREALKKLTK 

NIATDSYISVNLRDVYARSIMEMLRLKGRER 

ASTRSSGGDDFWF 


828 


2178 


A 


6786 


5672 


1360 


GTFiPASSGPVPLPPAAVSAATREELGEPVPFV 

TASSGFQSMHSSNPKVRSSPSGNTQSSPKSKQ 

EVMVRPPTVMSPSGNPQLDSKFSNQGKQGGS 

ASQSQPSPCDSKSGGHTPKALPGPGGSMGLK 

NGAGNG AKGKGKRERS rS ADSFDQRDPGTPN 

DDSDIKECNSADHIKSQDSQHTPHSMTPSNAT 

APRSSTPPHGQTT ATEPTPAQKTP AKVVYVF S 

TBMANKAAEAVLKGQVETTVSFH1QNISNNK 

TERSTAFLNTQISALRNDPKPLPQQPPAPANQ 

DQNS SQNTRLQPTPPIPAPAPKPAAPPRPLDRE 

SPGVENKLIPSVGSPASSTPLPPDGTGPNSTPN 

NRAV1PVSOGSNSSSADPKAPPPPPVSSGEPPT 

LGENPDGLSQEQLEHRERSLQTLRDIQRMLFP 

DEKEFTGAQSGGPQQNPGVLDGPQKKPEGPI 

QAMMAQSQSLGKGPGPRTDVGAPFGPQGHR 

DVPFSPDEMVPPSMNSQSGTIGPDHLDHMTP 

F.Q1 AW1 XLQQRFYEEKRRKPKQWVQQCSLQ 

DMMVHQHGPRGWRGPPPPYQMTPSEGWAP 

GGTEPFSIXJINMPHSLPPRGMAPHPNMPGSQ 

MRLPGFAGNCNSEMEGPNVPNPASRPGLSGV 

SWPDDVFKIPDGRNFPPGOGIFSGPGRGERFP 

NPQGLSEEMFQQQLAEKQLGLPPGMAMEGIR 

PSMEMNKMIPGSQRHMEPGNNPIFPRIPVEGP 

LSPSRGDFPKGIPPQMGPGRELEFGMVPSGM 

KGDVNLNVNMGSNSQMIPQKMREAGAGPEE 

MLKLRPGG SDMLP AQQKMVPLPFGEHPQQE 

YGMGPRPFLPMSQGPGSNSGLRNLREPIGPDQ 

RTNSRLSHMPPLPLNPSSNPTSLNTAPPVQRG 

LGRKPLDISVAGSQVHSPGINPLKSPTMHQVQ 

SPMLGSPSGNLK.SPQTPSQLAGMLAGPAAAA 

SIKSPPVLGSAAASPVHLKSPSLPAPSPGWTSS 

PEPPLQSPG1PPNI IKAPLTMASP AMLGNVESG 

GPPPPTASQPASVNIPG\SLPSSTPYTMPPEPTL 

SQNPLSIM\MSR\MSKPAJ^PS\SNPGyNHDAl 
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KTVASSDDDSPPARSPNLPSMNNMPGMGINT 

QNPRISGPNPWPMFI'LSPMGMTQPLSHSNQ 

MPSPNAVGPNIPPHGVPMGPGLMSHNPIMGH 

G SQEPPMVPQGRMGFPQGFPP VQSPPQQ VPFP 

HNGPSGGQGSFPGGMGFPGEGPLGRPSNLPQ 

SSADAALC3CPGGPGGPDSFTVLGNSMPSVFT 

DPDLQEV 1 RPG ATGIPEFDLSRJ IPSEKPSQTLQ 

YFPRGEVPGRKQPQGPGPGFSHMQGMMGEQ 

APRMGLALPGMGGPGPVGTPDIPLGTAPSMP 

GHNPMRPPAFLQQGMMGPHHRMMSPAQST 

MPGQ PTLM SNPAAA VGMIPGKDRGPAGL YT 

HPGPVGSPGMMMSMQGMMGPYNRTS 


829 


2179 


A 


6797 


433 


3 


ASFFNFSICICKIILEVGPPVGHPAHDDVGGRH 

GPGGR/GSRSPRSLQCAPGGGRRSGCPAGSSP 

ASTCPPSPGGSGADRFGPSPPPPSREAAPTAG 

AAASSTSSGASCPPVPASSRWGVRSRTRSGSG 

GEREPRDRPSERPRLV 


830 


2180 


A 


6800 


3 


1911 


LPERAFGPRTFRAPRRRRRRLLLSPPPRPPPPL 

DREPRAPGPWLCPSRAGTAQDPARIRERRGR 

VAGGAAGPAMELRARGWWLLCAAAALVAC 

ARGDPASKSRSCGEVRQIYGAKGFSSS\DVPQ 

AEISGEHLRJCPQGYTCCTSEMEENLANRSHA 

ELETALRDS SR VLQAMLATQLR SFDDHFQHI . 

LNDS ERTLQ ATFPG AFGELYTQN ARAFRDL Y 

SELRLYYRGANLHLEETLAEFWARLLERLFK 

QLHPQLLLPDDYLDCLGKQAEALRPF\GEAP\ 

RELRLRATRAVFVAARVSFVQGLGVASVDWR 

KVAOVPt^iXPEfASRAVIEAGSYC/ALHCVGVP 

G ARPCPD YCRNVLKGCLANQ ADI ,D AEWRNL 

LDSMVLITDKFWGTSGVESVIGSVHTWLAEA 

INALQDNRDTLTAKVIQGCGNPKVNPQGPGP 

EEKRRRGKLAPRERPPSGTLEKLVSEAKAQL 

RDVQDFWISLPGTLCSEKMALSTASDDRCWN 

GMARGRYLPEVMGDGLANQINNPEVEVDIT 

KP D MTIRQQ IMQLKIMTNRLR S A YNGNDV DF 

QDASDDGSGSGSGDGCLDDLCGRKVSRKSSS 

SRTPLTHALPGLSEQEGQKTSAASCPQPPTFL 

LPLLLFLALTVARPRWR 


831 


2181 


A 


6808 


2 


1522 


A SRHGMTPG ALLMLLG ALGPPL APG VRG SEA 

EGRLREKLFSGYDSSVRPAREVGDRVRVSVG 

LILAQLISLNEKDEEMSTKVYLDLEWTDYRLS 

WDPAEHDGIDSLR1TAJESVWLPDVVLLNNND 

GNFDVALDISWVSSDGSVRWQPPGIYRSSCS 

IQVTYFPFDWQNCTMVFSSYSYDSSEVSLQT 

GLGPDGQGHQEIHIHEGTFIENGQWENIHKPS 

RLIQPPGDPRGGREGQRQEVTFYLIIRRKPLFY 

LVNV1APCII ,ITLLAIFVFYLPPD AGEKMGI ,SIF 

ALLTLTVFLLLLADKVPETSLSVPIIIKYLMFT 

MVLVTFSVILSVWLNLHHRSPHTHQMPLWV 

RQIFIHKLPLYLRLKRPKPERDLMPEPPHCSSP 

GSGWGRGTDEYFIRKPPSDFLFPKPNRFQPEL 

SAPDLRRFIDGPNRAVALLPELREWSSISY1A 

RQLQEQEDHDALKEDWQFVAMWDRLFLW 

TFIIFTSVGTLWIFLDATYHLPPPDPFP 


832 


2182 


A 


6824 


71 


1079 


ETMAKNPPENCEDCHILNAEAFKSKKICKSLK 
I CGLVFGILALTLI VLFWGSKHF WPEVPKKA Y 
DMEHTFYSNGEKKKIYMEIDPVTRTEIFRSGN 
GTDETLEVHDFKNGYTGIYFVGLQKCFIKTQr 
KVIPEFSEPEEEIDENEE1TTTTF EQSVIWVPAE 
KPIENRDF1.KNSKILEICDNVTMYW\INPTL\IS 
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GTFAKQLHHNFAFIILVSELQDFEEEGEDLHFP 
ANEKKGI EQNEQWWPQ VK VEKTRHARQ A S 
EEELPINDYTENGIEFDPMLDERGYCCIYCRR 
GNRYCRRVCEPLLGYYPYPYCYQGGRVICRV 
IMPCNWWVARMLGRV 


833 


2183 


A 


6846 


116 


602 


EAEGEQVCGAKCCGDAPHVENREEETARIGP 
G VMESKEERALNNL I VENVNQENDEKDEKE 
Q VANK.GEPL ALPLNV SEYC VPRGNRRRFRVR 
QPILQYRWDIMHRLGEPQARMREENMER1GE 
EVRQLMEKLREKQLSHSLRAVSTDPPHHDHH 
DEFOLMP 


834 


2184 


A 


6851 


3 


2024 


PNGVALLHI .PG AAVIPNTNYMFQDALGGRSR 

GSREESPAPSRAPASASLWRRLVWEAKMAA 

HAAAAAQAAAAQ AAHAEAAD S W YLALLGF 

AEHFRTSSPPKJRLCVHCLQAVFPFKPPQPJEA 

RTHLQLGSVLYHHTKNSEQARSHLEKAWL1S 

QQIPQFEDVKFEAASLLSELYCQENSVDAAXP 

LLRKAIQISQQTPYWHCRLLFQLAQLHTLEKD 

LVSACDLLGVGAEYARWGSEYTRALFLLSK 

GMLLLMERKLQEVHPLLTLCGQIVENWQGN 

PIQKESLRVFFLVLQVTHYLDAGQVKSVKPC 

LKQLQQCIQT1STLMDDEILPSNPADLFHWLP 

KFHMCVLVYLVTVMHSMOAGYLFKAOKYT 

DKALMQLEKLFOvTLDCSPII.SSFQVILLEKIIM 

CRL VTGHKAT ALQEI SQVCQLCQQSPRLFSN 

HAAQLHTLLG L YC VS VNCMDNAEAQFTTAL 

RLTNHQELWAFIVTNLASVYIREGNRHQEWX 

L YSLLERINPD H S FP VS S HCLRAAAF YVRGLF 

SFFQGRYNEAKRFLRETLKMSNAEDLNRLTA 

CST , VLLGH IF YVL GNHRE SNNMVVP AMQL AS 

KfPDMSVQLWSSALLRDLNKACGNAMDAHE 

AAQMHQNFSQQLLQDHIEACSLPEHNLITWT 

DGPPPVQFQAQNGPNTSLASLL 


835 


2185 


A 


6855 


334 


1268 


PTRRPILPLTSPKAISVPSPLQGKQHTLVKSCL 

SVSGIGGFLVSLSSRMKLQTLAVSVTALKFWS 

AYVPCQTQDRDALRLTLEQIDLIRRMCASYSE 

LELVTSAKA1>TDTQKJLACLIGVEGGHSLDNS 

LSILRTFYMLGVRYLTLTHTCNTPWAESSAK 

GVHSFYNNISGLTDFGEKVVAEMNRLGMMV 

DLSHVSDAVARRALEVSQAPVIFSHSAARGV 

CNSARNVPDDILQLLEEERWAFVMVSLFHGE 

LIQWQPIRPMCSTVADHFDHIKAVMGSKFTGI 

GGDYDGAGKYRKKTTCKAP WRTS SRMSS 


836 


2186 


A 


6862 


315 


11 


PPRSRPSCWRKKVGPGRPWWWGGTGPPGQG 
RPEIRLLPLPMTGACGAVAASRTGSSGPG/SSL 
PNGHGGKGSGLANGLAGNP\GHLGLGSSFGT 
GPGSGRPPP 


837 


2187 


A 


6863 


2 


1615 


VLRGQRGPAGGLAEERRRGRNEWRIHDVTT 

APFPGLVQRRSRLL1VSQVRYFLKNKVSPDLC 

NEDGLTALHQCCIDNFEEIVKLLLSHGANVN 

AKDNELWTPLHAAATCGHINLVKILVQYGA 

DLLAVNSDGNMPYDLCEDEPTLDVIETCMAY 

QGITQEKINEMRVAPEQQMIADIHCMIAAGQ 

DLDWIDAQGATLLHIAGANGYLRAAELLLDH 

GVRVDVKDWDGWEPLHAAAFWGQMQMAE 

LLVSHGAN\LNARTSMDEMPIDLCEEEEFKVL 

LLELK\HKHDVIMKSQLRHKSSLSRRTSHRQA 

S/SVGKWRRTQPVGTGPNL\YRKE YE/GEE AI 

LWQRSA\AEDQRTSTYNGDIR£*T\RTDQENKD 
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Y-Tyrosine, X—Unknown, *=Stop codon, 
/=possible nucleotide deletion, \=possible 
nucleotide insertion 














PNPRLEKWLLSEFPTKIPRGELDMPVENGLR 

APVSAYQYALANGDVWKVHEVPDYSMAYG 

NPGVADATPPWSSYKEQSPQTLLELKLRQRAA 

AKLLSHPFLSTHLG SSMARTGESS SEGKAPLI 

GGRTSPYSSNGTSVYYTVTSGDPPLLKFKAPI 

EEMEEKVHGCCRIS 


838 


2188 


A 


6865 


6291 


739 


AGPLEPRVQGAMALQLWALTLLGLLGAGAS 

LRPRKLDFFRSEKELNHLAVDEASGWYLGA 

VNALYQLDAKLQLEQQVATGPVLDNKKCTP 

PIE ASQC HEA EMTDNVNQLL LVDPP RKRL V E 

CGQLLKGI\CALRAL SNI SLRLFYEDGSGEKSF 

VASNDEGVATVGLVSSTGPGGDRVLFVGKG 

NGPHDNGITVSTRLLDRTDSRE AFE A YTDHAT 

YKAGYLSTNTQQFVAAFEDGPYVFFVTNQQD 

KHPARNRTLLARMCREDPNYYSYLEMDLQC 

RDPDIH A AAFGTCLAA SV AAPGSGRVL Y A VF 

SRDSRSSGGPGAGLCLFPLDEVHAKMEANRN 

ACYTGTREARDIFYKPFHGDIQCGGHAPGSSK 

SFPCGSEHLPYPLGSRDGLRGTAVLQRGGLN 

LTAVTVAAENNHTVAFLGTSDGRJLKVYLTP 

DGTSSEYDSILVEINKRVKRDLVLSGDLGSLY 

AMTQDKVFRLPVQECLS YPTCTQCRDS QDPY 

CG WC V VEGRCTRKAECPRAEEASH WL WSRS 

KSCVAVTSAQPQNMSRRAQGEVQLTVSPLPA 

LSEEDELLCLFGESPPHPARVEGEAVICNSPSS 

IPVTPPGQDHVAVTIQLLLRRGNIFLTSYQYPF 

YDCRQAMSLEENLPCISCVSNRWTCQWDLR 

YHECPJ3ASPNPEDG I VRAHMEDSCPQFLG PSP 

LVlPMNHETDVNFQGKNLDTVKGSSt-HVGSD 

LLKFMEPVTMQESGTFAFRTPKLSHDANETL 

PLHLYVKSYGKNIDSKLHVTLYDCSFGRSDC 

SLCRAANPDYRCAWCGGQSRCVYEALCNTT 

SECTPPVTrPJQPETOPLGGGIRrTTLGSNLGVQ 

AGDIQRJSVAGRNCSFQPERYSVSTRIVCVIEA 

AFTPFTGGVEVDVFGKLGRSPPNVQFTFQQP 

KPLSVEPQQGPQAGGTTLTIHGTHLDTGSQED 

VRVTLNGVPCKVTKJ-GAQLOCVTGPQATRG 

QMLLEVSYGGSPVP'NPGIFFTYRENPVLRAFE 

PLRSFASGGRSINVTGQGFSLIQRFAMVV1AEP 

LQSWQPPREAESLQPMTWGTDYVFHNDTK 

WFLSPAVPEEPEAYNLTVLIEMDGHRAIXRT 

EAG AFEY VPDPTFENFTGG VKKQVNKL IRAR 

GTNLNKAMTLQEAEAFVGAERCTMKTLTET 

DLYCEPPEVQPPPKRRQKRDTTHNLPEFIVKF 

GSREWVLGRVEYDTRVSDVPLSLILPLVIVPM 

VWIAVSVYCYWRKSQQAERJEYEKJKSQLEG 

LF^SVRDRCKKEFTDLMIEMEDQTNDVHEAG 

IPVLDYKTYTDRVFFLPSKDGDKDVMITGrCL 

DIPEPRRPWEQALYQFSKLLNSKSFLINF1HT 

I VFNOPFFSARAKVYFA^I I TVA1 HOKI FYYT 

DIMHTLFLELLEQYVVAKNPKLMLRRSETVV 

ERMLSNWMSICLYQYLKDSAGEPLYKLFKAI 

KHQVEKGPVDAVQKKAKYTLNDTGLLGDD 

VEYAPLTVSVTVQDEGVDAIPVKVLNCDTISQ 

VKEKIIDQVYRGQPCSCWPRPDSVVLEWRPG 

STAQILSDLDLTSQREGRWKRVNTLMHYNVR 

DGATLtLSKVGVSQQPEDSQQDLPGERHALL 

EEENRVWHLVRPTDEVDEGKSKRGSVKEKE 

RTKAITEIYLTRLLSVKGTLQQFVDNFFQSVL 

APGHAVPPAVKYFFDFLDEQAEKHNIQDEDTT 
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Amino acid sequence (A— Alanine C=Cysteine, 
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
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T=Threonine, V=Valine, W=Tryptophan, 
Y=Tyrosine, X=Unknown, *=Stop codon, 
/=possible nucleotide deletion, \=possible 
nucleotide insertion 














HlWKTNSLPLRfWVN[LKNPHFIFDVHVHEW 
D A SL S VIAQTFMDACTRTEHKJLSRDSPSNKXL 
Y AKE1 STYKKM VED YYKGIRQMVQ VSDQDM 
NTHLAEISRAHTDSLNTLVALHQLYQYTQKY 
YDE1IN ALEEDPAAQKMQLAFRLQQIAAALE 
NKVTDL 


839 


2189 


A 


6872 


1 


1485 


RARRLALQCHVCVCALTPGEQSGRRLPGQT 

WLMFSCFCFSLQDN SFS STTVTECDEDPVSLH 

EDQTDCSSLRDENNKENYPDAGALVEEHAPP 

SWEPQQQNVEATVLVDSVLRPSMGNFKSRKP 

KSIFiCAESGRSHGESQEIXHWSSQSECQVRA 

GTPAHESPQNNAFKCQET\VRL\QPRJDQRTAT 

SPKDAFETR\QDLNEEEAAQVHGVKDPAPAS 

TQSVLAVDGTDSADPSPVHKDGQNEADSAPE 

DLHSVGTSRLLL/YHITDGDNPTAVRHGCSL/F 

SGQSQRFNLDPESAPSPPSTQQFMMPRSSSRC 

SCGDGKEPQTTTQLTKHIQSLKRKIRKFEEKFE 

QEKKYRPSHGDKTSNPEVLKWMNDLAKGRK 

QLKELKLKLSEEQGSAPKGPPRNLLCEQPTVP 

RENGKPEAAGPEPSSSGEETPDAALTCLKERR 

EQLPPQEDSKVTKQDKNLIKPLYDRYRIIKQIL 

STPSLIPTIVSQDTCMLLLCTDV 


840 


2190 


A 


6873 


2 


2054 


FFRFYFSF1RLFAMSLADLTKTNIDEHFFGVAL 

ENNRRSAACKRSPGTGDFSRNSNASNKSVDY 

SRSQCSCGSLSSQYDYSEDFLCDCSEKAINRN 

YLKQPVVKEKEKKKYNVSKJSQSKGQKEISV 

EKKJ4TWNASLFNSQIHMIAQRRDAMAHRILS 

AM.HKIKGLKNTLADMHHKLEATLTENQFLK 

QLQLRHLKAJGKYENSQNNLPQIMAKHQNEV 

KNLRQLLRKSQEKERTLSRKLRETDSQLXKT 

KJD I LQ ALQKL SEDKNL AEREELTHKL S IITTK 

MDANDKKIQSLEKQLRLNCRAFSRQLAIETR 

KTLAAQTATKTLQVEVKHLQQKLKEKDREL 

EIKNI YSHRILKNLHDTEDYPKV SSTKS VQAD 

RKILPFTSMRHQGTQKSDVPPL7TTKGKKATG 

NIDHK^KSTEINHEIPHCVNKLPKQEDSKRKY 

EDLSGEEKHLEVQILLENTGRQKDKXEDQEK 

KNIFVKEEQELPPKIIEV1HPERESNQEDVLVR 

EKFKRSMQRNGVDDT\LGKGTAPYTKGPLRQ 

RRHYSFTEATENLHHGLPASGGPANAGNMR 

YSHSTGKHLSNREEMELEHSXDSGYEPSFGKS 

SRIK VKDTTFRDKKSSLMEELFGSG YVLK I D 

QSSPGVAKGSEEPLQSKESHPLPPSQASTSHA 

FGDSKVTVVNSIKPSSPTEGKRKI1I 


841 


2191 


A 


6874 


3 


2867 


SSRTREMEEKEILRRQIRLLQGL1DDYKTLHG 

NAPAPGTPAASGWQPPTYHSGRAFSARYPRP 

SRRGYSSHHGPSWRKKYSLVNRPPGPSDPPA 

DHAVRPLHGARGGQPPVPQQHVLERQVQLS 

QGQNWIKVKPPSKSGSASASGAQRGSLEEFE 

DTPWSDQRPREGEGEPPRGQLQPSRPTRARG 

TCSVEDPLLVCQKEPGKPRMVKSVGSVGDSP 

REPRRTVSESV1AVKASFPSSALPPRTGVALG 

RKLGSHSVASCAPQLLGDRRVDAGHTDQPVP 

SGSVGGPARPASGPRQAREASLWTCRTNKF 

RKNNYKWVAASSKSPRVARRALSPRVAAEN 

VCKASAGMANKVEKPQLIADPEPKPRKPATS 

SKPGSAPSKYKWKASSPSASSSSSFRWQSEAG 

SKDHASQLSPVLSRSPSGDVRPALAHSGLKPLS 

GETPLSAYKVKTRTKIIRRRGSTSLPGDKKSG 

TSPAAT AKSFtLSLRRRQ ALRGKS SP V LKKTFN 
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/^possible nucleotide deletion, \=possible 
nucleotide insertion 














KGLVQVTKHRLCRLPPS RAHLPTKEAS SLHA 
VRTAPTSKVIKTRYRTVKKTPASPLSAPPFPLS 
LPSWRARRLSLSRSLVLNRLRPVASGGGKAQ 
PGSPWWRSKGYRCIGGVLYKVSANKLSKTSG 
OPSDAGSRPLLRTGRLDPAGSCSRSLASRAVO 
RSI^IIRQARQRREKRXEYCMYYNRFGRCNR 
GERCPYIHDPEKVAVCTRFVRGTCKKTDGTC 
PFSHHVSKEKMPVCSYFLKGICSNSNCPYSHV 
YVSRKAEVCSDFLKGYCPLGAKCKKKHTLLC 
PDFARRGACPRGAQCQLLHRTQKRHSRRAAT 
SPAPGPSD ATARSRVSA SHGPRKPSA SQRPTR 
'QTPSSAALTAAAVAAPPHCPGGSASPSSSKAS 
S S SSSSS SPP ASLDHEVAPSLQE AALAAACSNR 
LCKLPSFISLQSSPSPGAQPRVRAPRAPLTKDS 
GKPLH1KPRL 


842 


2192 


A 


689S 


506 


2071 


WPDLVHTWSSFFAMGSCCSCPDKDTVPDNH 
RNKFK VIN VDDDGNELG SGEMELTDTELIL YT 
RKRDS VK WHYLCLRRYGYD SNLFSFESGRRC 
QTGQGIFAFKCARAEELFNMLQEIMQNNSIN 
WEEPWERNNHQTELEVPRTPRTPTTPGFAA 
QNLPNGYPRYPSFGDASSHPSSRHPSVGSARL 

p^vr;PF«;THPi i VAFPnvHTVVMTrnvnFPR 
r j> v udlij i nr ll. v /vtllvL^; vni i v in i j vj v v^CiCiv 

KNRTSVHVPLE ARV SNAESSTPKEEPS S IEDR 

DPQILLEPEGVKFVLGPTPVQKQLMEKEKLE 

QLGRDQVSGSGANNTEWDTGYDSDERRDAP 

SVNKLVYENINGLSIPSASGVRRGRLTSTSTSD 

TQNINNSAQRRTALLNYENLPSLPPVWEARK 

LSRDEDDNLGPKTPSLNGYHNNLDPMHNYV 

NTFN VTVP A S A HK I F Y SR_R RnCTPnrVFNFniR 

RPSLEHRQLNYIQVDLEGGSDSDNPQTPKTPT 

TPLPQTPTRRTELYAVIDIERTAAMSNLQKAL 

PRDDGTSRNKTRHN STVDLPL 


843 


2193 


A 


6919 


2 


663 


AGRPGTTHASGKMAYQSLRLEYLQIPPVSRA 

YTTACVLTTAAVQLELITPFQLYFNPELIFKHF 

QI WRLITNFLFFGP VGFNFLFNMIFL YRYCRM 

LEEGSFRGRTADFVFMFLFGGFLMTLFGLFVS 

L/VFLGPGLYNN/GSSMCGAE\EPLCPHELLRP 

SQLPGPLSALGAHGIFLWGELNHCGPFGYCS 

WTFnFFLGRCISQSTWWNKNSE>rnYFESYF 


844 


2194 


A 


6928 


902 


366 


\ IRLCMPIQGACGERME/F SLLLPGLECNGVIL 
AHCNLRLPGSSNSPASASQVAGITGVCHHAR 
LIFVFSVETGFLHAGQAGLELLTSGDPPASAS 
QSAGITGKSQHTRPGYEFnPYSAAQEDALKA 
LM 


845 


2195 


A 


6939 


1660 


317 


LYPENLGESLFPILLLPPPWPDGGRPCCVEMS 

TRAKKJLRRIWRJLEEKESVAGAVQTLLLRSQE 

GGV\TSAAASrLSEFPRRTX3ESRTRTRALGLPT 

LPMEKLAASTEPQGPRPVLGRESVQVPDDQD 

FRSFRSECEAEVGWNLTYSRAGVSVWVQAV 

EMDRTLHKDCCRMECCDVPAETLYDVLHDIE 

YRKKWDSNVIETFDIARLTVNADVGYYSWR 

CPKPLKNRDVITLR5WLPMGADYTIMNYSVX 

HPKYPPRKDLVRAVSIQTGYLIQSTGPKSCVrr 

YL AQ VDPKGSLPKWWNKS SQFLAPKAMKK 

MYKACLKYPEWKQKHLVPHFKPWLVHPEQSP 

LPSLALS\ELSVQHADS\LENTOESAV\AESREE 

R\MGGAGGEGVSDDDTSLYAEAPHRFRETETG 

PGAGRALGAAAAPALSPLHPPGTWWHRARP 

RRVLQPGWTEPQ 
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846 


2196 


A 


6944 


42 


2672 


RRKMAGCRGSLCCCCRWCCCCGERETRTPE 

ELTILGETQEEEDEILPRKDYESLDYDRCINDP 

YLEVLETMDNKKGRRYEAVKWMVVFAIGV 

CTGL VG L FVDFFVRL FTQ L KFG WQTS VEECS 

QKGCLALSLLELLGFNLTFVFLESLLGLIEPVE 

AGSGITEGKCYLYARQVPGLVRLPTLLWKAL 

GVLLTVAAMLL1\GLGSPMIHSGSVVGAGLPQ 

FQSISLRKIQFNFPYFRSDRYGKVDKRDFVSAG 

AAAGVAAAFGAPIGGTLFSLEEGSSFWNQGL 

TWKVLFCSMSATFTLNFFRSGIQFGSWGSFQL 

PGLLNFGEFKCSDS DKKCHLWT AMDLGFF V 

V M G VTG GLL G ATFNC LNKRL AX YRMRN VHP 

KPKLVRVLESLLVSLVTTVVVFVASMVLGEC 

RQMSSSSQIGNDSFQLQVTEDVNSSIKTFFCP 

NDTYNDMATLFFNPQESAILQLFHQDGTFSPV 

TL ALFFVLYFLLAC WTYGISVPSGLFVPSLLC 

GAAFGRLVANVXKSYIGLGH1YSGTFALIGAA 

AFL GGWRMTISLTVIL IESPiNEITYGLPIMVT 

LMVGKWTGDFFNKGI\YDIHVGLRGVPLLEW 

ETEVEMDKLRASDIMEPNLTYVYPHTRIQSLV 

S O-RTTVHHAFP V VTENRGNEKEFMKGNQLI S 

NNIKFKKSSILTRAGEQRKRSQSMKSYPSSEL 

RNMCDEHIASEEPAEKEDLLQQMLERRYTPY 

PNLYPDQSPSEDWTMEERFRPLTFHGHLRSQ 

LVTLL VRG VC YSESQS SASQPRLS Y AEMAED 

YPRYPDIHDLDLTLLNPRMIVDVTPYMNPSPF 

TVSPNTHVSQVFNLFRTMGLRHLPVVNAVGE 

IVGIITRHNLTYEFLQARLRQHYQTI 


847 


2197 


A 


6951 


3 


1994 


NTNS SS VTN S AAG VEDLNI VQ VTVPDNEKER 

LSSIEKIK.QLREQVNDLFSRKFGEAIGVDFPVK 

VPYRKITFNPGCWIDGMPPGVVFKAPGYLEI 

SSMRRILEAAEFIKFTVIRPLPGLELSNGEYST 

VGKRKIDQEGRVFQEKWERAYFFVEVQNIST 

CLICKRSMS V SKEYNLRRHYQTNHSKH YDQY 

MERMRDEKLHELKKGLRKYLLGLSDTECPfi 

QKQVFANPSPTQKSPVQPVEDLAGNLWEKLR 

EKIRSFVAYSIAIDEJTDINNT1 QLAIFIRGVDE 

NFDVSEELLDTVPNfTGTKSGNEIFSRVEKSLK 

NFCINW SKLVSV ASTGTPPM VDANNGLVTKL 

KSRVATFCKG AELKS ICCHHPESLC AQ\KLKM 

DKVMDVVVKSVNWICSRGLNHSEFTTLLYEL 

DSQYGSLLYYTEIKWLSRGLVLKRFFESLEEI 

DSFMSSRGKPLPQLSSIDWIRDLAFLVDMTM 

HLNALNISLQGHSQIVTQMYDLIRAFLAKLCL 

WETHLTRNNLAHFPTLKLVSRNESDGLNY1P 

KIAELKTEFQKRLSDFKLYESELTLFSSPFSTKI 

DSVHEELQMEVIDLQCNTVLKTKYDKVGIPE 

FYKYLWGSYPKYKHHCAKILSMFGSTYICEQ 

LFSIMKLSKTKYCSQLKDSQWDSVLHIAT 


848 


2198 


A 


6985 


3 


289 


SVQYLPGRPTRTHASTDAPLMLKFTPLPSKTK 
ASAPVQCLLLMAATFSPOGLAKPHSGTTPmC 
CFNAINTKIPIQRLESYTRnNlQCPKEAVM 


849 


2199 


A 


6999 


963 


5 


LDFLCHRDMGDNITS1TEFLLLGFPVGPRIQM 

LLFGLFSLFV'VFTLLGNGTILGLISLDSRLHAP 

MYFFLSHLVAWDLAYACNTVPRMLVNLLHP 

AKPISFAGRMMQTTTJFSTFA VTECLLL VVMS 

YDLYV\AICHPLRYLAJMTWRVCITLAVTSWT 

TGVLLSLEHLVLLLPLPFCRPQKIYHFFCEILA 

VLKLACADTHINENMVLAGAISGLVGPLSTIV 

VSYMCILCAILQIQSREVQRKAFCTCFSHLCVI 
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GLFYGTAIIMYVGPRYGNPKEQKKYLLLFHS 
LFNFMLNPLICSLRNSEVKNTLKRVLGVERAL 


850 


2200 


A 


7001 


1 


1011 


MGNDSVSYEYGDYSDLSDRPVDCLDGACLA1 

DPLRV APLPL YAAIFL VG VPGN AMV A WV AG 

KVARRRVGATWLLHLAVADLLCCLSLPILAV 

PIARGGHWPYGAVGCRALPSIILLTMYASVLL 

LAAL S ADLCFLALGP A WXCLRFS/G ACG VQ VA 

CGAAWTLALLLTVPSAIYRRLHQEHFPARLQ 

CWDYGGSSSTENAVTAIRFLFGFLGPLVAVA 

SCHSALLCWAARRCRPLGTAIWGFFVCWAP 

YHLLGL VLTVAAPN S ALLARALRAEPL I VGL 

ALAHSCLNPMLFLYFGRAQLRRSLPAACHW 

ALRESQGQDESVDSKKSTSHDLVSEMEV 


851 


2201 


A 


7011 


1 


2310 


AAASPLRMSRKGPRAEVCADCSAPDPGWASI 

SRGVLVCDECCSVHRSLGRHISIVKHLRHSA 

WPPTLLQMVHTLASNGANSIWEHSLLDPAQV 

QSGPALK.QTPKJDKV\HPIKSEFIRAKYQMLAF 

VHKLPCRDDDGVTAKDLSKQLHSSVRTGNLE 

TCLRLLSLGAQANFFHPEKGTTPLHVAAKAG 

QTLQAELLWYGADPGSPDVNGRTPIDYARQ 

AGHHELAERLVECQYELTDRLAFYLCGRKPD 

HKNGH YIIPQMAD SLDLSEL AKAAKKKLQ AL 

SNRLFEELAMDVYDEVDRRENDAVW1 ,ATQN 

HSTLVTERSAVPFLPVNPEYSATRNQGRQKL 

ARFNAREFATLIIDILSEAKRRQQGKSLSSPTD 

NLELSLRSQSDLDDQHDYDSVASDEDTDQEP 

LRSTGATRSNRARSMDSSDLSDGAVTLQEYL 

REIHKLQAENLQLRQPPGPVPTPPLPSERAEH 

TPMAPGGSTHRRDRQAFSMYEPGSALKPFGG 

PPGDELTTRLQPFHSTELEDDAIYSVHVPAGL 

YRTRKGVSASAVPFTPSSPLLSCSQEGSRHTSK 

LSRHGSGADSDYENTQSGDPLLGLEGKRFLE 

LGKEEDFHPELESLDGDLDPGLPSTEDVILKT 

EQVTKNIQELLRAAQEFKHDSFVPCSEKJHLA 

VTEMASLFPKRPALEPVRSSLRLLNASAYRJLQ 

SECRKTVPPEPGAPVDFQLLTQQV1QCAYDIA 

KAAKQLVTITTREKKQ 


852 


2202 


A 


7016 


484 


1777 


RISKJQVYYSTGYSSRKMNPTLGLAIFLAVLL 

TVKGLLKPSFSPRNYKALSEVQG WKQRMAA 

KELARQNMDLGFKLLKKLAFYNPGRNIFLSP 

LSISTAFSMLCLGAQDSTLDEIKQGFNFRKMP 

EKDLHEGFHYIIHELTQKTQDLKLSIGNTLF1D 

QRLQPQRKFLEDAKNFYSAET1LTNFQNLEM 

AQKQINDFI/ES KTHGK INNLIENI D PGTVM LL 

ANYrFFRARWKHEFDPNVTKEEDFFLEKNSS 

VKVPMMFRSGIYQVGYDDFCLSCTTLEIPYQK 

NTTAIFILPDEGKLKLHLEKGLQVDTT'SRWKTL 

LSRRWDVSVPRLHMTGTFDLKKTLSYIGVS 

OFEEHGDLTKIAPHRSLKVGEAVNKAELKM 

DERGTEGAAGTGAQTLPMETPLVVKIDKPYL 

LLIYSEKIPSVLFLGKIVNPIGK 


853 


2203 


A 


7017 


1 


3293 


MTHACNPSTLGGQGRRITRSHGRRRSSRGPV 

ARHVAAGAGHENKHGGSRRFPAGVAPRRAM 

ANVSKKVSWSGRDRDDEEAAPLLRRTARPG 

GGTPLLNGAGPGAARQSPRSALFRVGHMSSV 

ELDDKLLEPVDMDPPHPFPKEIPHNEKLLSLKY 

ESLDYDNSF^QLFLEEERRINHTAFRTVEIKR 

WVICALIGILTGLVACFIDIVVENLAGLKYRVI 

KGSILPNIDKFTEKGGLSFSLLLWATLNAAFV 
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LVGSV1VAFIEPVAAGSGIPQIKCFLNGVKIPH 

WR1.KTLVIKVSGVILSWGGLAVGKEGPMI 

HSGSVIAAGISQGRSTSLKRDFKJFEYFRRDTE 

KRDFVSAGAAAGVSAAFGAPVGGVLFSLEEG 

ASF WN QFLTWRI FF ASMI STFTLNFVL S 1 YHG 

NMWDLSSPGLINFGRFDSEKMAYTIHEIPVFI 

AMGVVGGVLGAVFNALNYWLTMFRIRYTHR 

PCLQVIEAVLVAAVTATVAFVLIYSSRDCQPL 

QGG S MS YPLQLFC ADGE YNSMAAAFFNTPEK 

SWSLFHDPPGSYNPLTLGLFTLVYFFLACWT 

YGLTVSAGVFIPSLLIGAAWGRLFGISLSYLTG 

AAIWADPGKYALMGAAAQLGGIVRMTLSLT 

VIMMEATSNVTYGFPIMLVLMTAXrVGDVFIE 

GLYDMHIQLQSVPFLHWEAPVTSHSLTAREV 

MSTPVTCLRRREKVGVrVDVLSDTASNHNGF 

PWEHADDTQPARLQGLILRSQLIVLLKHKVF 

VERSNLGLVQRRLRLKDFRDAYPRFPPIQSIH 

VSQDERECTMDLSEFMNP SPYTVPQEASLPR 

VFKJLFRALGLRHLVYVDNRNQVVGLVTRKD 

LARYRLGKRGLEELSLAQTGPKAQATAEGRV 

AG AAQQPCQLRA VTLEDLGLLLAG G LAS PEP 

LSLEELSERYESSHPTSTASVPEQDTAKHWNQ 

LEQ WV VELQ AEVACLREI DCQRCERATRSLL 

RELLQVRARVQLQGSELRQLQQEARPAAQAP 

EKEAPEFSGLQNQMQALDKRLVEVREALTRL 

RRRQVQQEAERRGAEQEAGLRLAKLTDLLQ 

QEEQGREVACGALQKNQEDSSRRVDLEVAR 

M 


854 


2204 


A 


7037 


139 


2604 


AGTWEPRPYDQAKETGAPGSQPPVPPMELRP 

WLL W W AATGTL VLL AADAQGQK VFTNTW 

AVRJPGGPAVANSVARKHGFLNLGQIFGDYY 

H F WHRG VTKRSLSPHRPRHSRLQREPQ VQ WL 

EQQVAKRRTKRDVYQEPTDPKFPQQWYL\SG 

VTQSRDLMVKAAWAQGYTGHGlVVSrjLDDGI 

EKNHPDLAGNYDPGASFDVNDQDPDPQPRY 

TQMNDNRHGTRCAGEVAAVANNGVCGVGV 

AYNARIGGVRMLDGEVTDAVEARSLGLNPN 

HIHTYSASWGPEDDGKTVDGPARLAEEAFFR 

GVSQGRGGLGSIFVWASGNGGREHDSCNCD 

GYTNSIYTLSISSATQFGNVPWYSEACSSTLA 

TTYSSGNQNEKQIVTTDLRQKCTESHTGTSAS 

APLAAGIIALTLEANKNLTWRDMQrlLVVQTS 

KPAHLNANDWATNGVGRKVSHSYGYGLLD 

AGAMVALAQNWTTVAPQRKCnDILTEPKDI 

GKRLEVRKTVTACLGEPNHITRLEHAQARLT 

LSYNRRGDLAIHLVSPMGTRSTLLAARPHDY 

SADGFNDWAFMTTHSWDEDPSGEWVLEIEN 

TSEANNYGTLTKFTLVLYGTAPEGLPVPPESS 

GCKTLTSSQACWCEEGFSLHQKSCVQHCPP 

GFAPQVLDTHYSTF^NDVH TIRASVCAPCFLAS 

CATCQGPALTDCLSCPSHASLDPVEQTCSRQS 

QSSRESPPQQQPPRLPPEVEAGQRLRAGLLPS 

HLPEWAGLSCAFIVLVFVTVFLVLQLRSGFS 

FRGVKVYTMDRGLISYKGLPPEAWQEECPSD 

SEEDEGRGERTAFIKDQSAL 


855 


2205 


A 


7058 


3 


1441 


QRPASQLLAPFAAEALPGAPRAAMAQHFSLA 

ACDWGFDLDHTLCRYNLPESAPLIYNSFAQF 

LVXEKGYDKELLhrVTPEDWDFCCKGLALDL 

EDGNFLKLANNGTVLPvASHGTKMMTPEVLA 

EAYGKKEWKHFLSDTGMACRSGKYYFYDN 
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YFDLPGALLCARVVDYLTKLNNGQKTFDFW 

KDIVAAIQHNYKMSAFKENCGIYFPEIKRDPG 

RYLHSRPESVKKWLRQLKNAGKILLLITSSHS 

DYCRLLCA\YTLGNDFTDLFDIVITNALKPGFP 

SHLPSQRPFRTLENDEEQEAI.PSI.DKPGWYSQ 

GNAVHLYELLKKMTGKJPEPKVVYFGDSMHS 

DIFPARHYSNWETVLILEELRGDEGTRSQRPE 

ESEPLEKKGKYEGPKAKPLNTSSKKWGSFFU 

DSVLGLENTEDSLVYTWSCKRISTYSTIAIPSI 

E AIAELPLD YKFTRFSS SNSKT AG Y YPNPPLV 

LSSDETLISK 


856 


2206 


A 


7082 


396 


1635 


SSPSVFEFEHAVQPVFTMEFLKTCVLRRNACT 
AVCFWRSKVVQKPSVRRISTTSPRSTVMPAW 
VIDKYGKNEVLRFTQNMMMPI1HYPNEVIVK 

VH A AS VNPinVNMP QnYf? AT Al NtVTkf R F)PT H 

VKIKGEEFPLTLGRDVSGVVMECGLDVKYFK 

PGDEVWAAVPPWKQGTLSEFWVSGNEVSH 

KPKSLTHTQAASLPYVALTAWSAINKVGGLN 

DKNCTGKRVLILGASGGVGTFA1QVMKAWD 

AHVTAVCSQDASELVRKLGADDVEDYKSGSV 

EEQLKSLKPFDFILDNVGGSTETWAPDFLKK 

W SG ATYVTL VTPFLLNMDRLG I ADGMLQTG 

VTVGSKALKHFWKGVHYRWAFFMASGPCL 

DD1AELVD AGKI RPVUEQTFPF SK VPEAFLKV 

ERGHARGKTVrNVV 


857 


2207 


A 


7088 


320 


2417 


LRRRKMTPQ S L L QTTL FLLSLLFLV QG AHGR 

GHREDFRFCSQRNQTHRSSLHYKPTPDLRISIE 

NSEEALTVHAPFPAAHPASRSFPDPRGLYHFC 

L Y WNRHAGRLHLLY GKRDFLLSDKASSLLCF 

QHQEESLAQGPPLLATSVTSWWSPQK1SLPSA 

ASFTFSFHSPPHTGAHNASVDMCELKRDLQL 

LSQFLKHPQKAS RRPS AAPASQQLQ SLESKLT 

SVRFMGDMGSFEEDR1NATVWKLQPTAGLQ 

DLHIHSRQEEEQSEIMEY SVLLPRTLFQRTKG 

RSGEAEKRLLL VDFSSQALFQDKN SSQVLGE 

KVLGIVVQNTKVANLTEPWLTFQHQLQPKN 

VTLQCVFWVEDPTLSSPGHWSSAGCETVRRE 

TQTSCFCNHLTYFAVLMVSSVEVDAVHKHY 

LSLLSYVGCWSALACLVTIAAYLCSRVPLPC 

RRKPRDYTIKVHMNLLLAVFLLDTSFLLSFPV 

ALTG SEAGCRAS AIFLHFSLLTCLSWMGLEG 

YNLYRJLWEVFGTYVPGYLLKXSAMGWGFPI 

FLVTLVALVDVDNYGPIILAVHRTPEGVIYPS 

MCWIRDSLVSYITNLGLFSLVFLFNMAMLAT 

MVVQILRLRPHTQKWSFTVLTLLCLSLVLGVLP 

WALtFFSFASGTFQLVVLYLFSirTSFQGFLIFI 

WYWSMRLQARGGPSPLKSNSDSAPJLPISSGS 

TSSSR1 


858 


2208 


A 


7091 


185 


415 


DAGA VKSSDTNI WFRGMCDDKKGHRCPS * G 

QPQHFHVAFHTEA£GAMFYFRLHVIHRVMQS 

QQQLFPSTLFSWLLE 


859 


2209 


A 


7136 


3 


302 


FFFWRQSLALLPRLECSGATGAHCNLHFPGSS 
DCPTSAS* LAGITG ACYHA WLLFVFLAETGFH 
HVGQGGLELLTSSDPSGSASQSAGITGVSHCT 
WPI 


860 


2210 


A 


7156 


23 


591 


ALSTETRTP D MRRLLL VT SL VWLL WEAG A V 
PAPKVPIKMQVKHWPSEQDPEKAWGARWE 
PPEKDDQLWLFPVQKPKIXTTEEKPRGQGR 
GPLLPGTKA V/METEDTLGRVLSPEPDHD SLY 
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HPPPEEDQGEERPRLWVMPNHQVLLGPEEDQ 

DHTYHPQ*GSRGHHCPRPVPRPRLLGLGPSLP 

CPS 


861 


2211 


A 


7161 


1220 


1003 


NYVCTIAF*EKKMGF* LSLSCLVLLFVLFLDCI 
L I'l'l'l RIMFHCTYLFASVCLSLLNTLLSPNCL 
KSAMILQ 


862 


2212 


A 


7211 


665 


847 


LKYYHITMGIYKTGKKVIL*KSSMSNRFSVIF 
YKNIQKLSFSNYVYHQNYVFSSDWSYDF 


863 


2213 


A 


7212 


924 


1273 


HGSSCALGDLAPG*LPSGPVLSSPAVRL*RKP 
L V WDSPSCLP ATGPT * GLVL VLGGPDCT* W A 
RGQHEHKRMRAP+ SCR VTVNLAKKKKKTDQ 
CIKPNYQSPPKECDYN1LANSVA 


864 


2214 


A 


7214 


845 


1619 


SDKGGKKAORKNHLRHAFPLLPHRVRERLH 

DPKVPVDADHVQGQDPGRAAHD1HGEDVTE 

KVSKDPLAPDEVGDTDEGHDRHGHREVGQR 

HGHDQEEVAYEERACEGGKFATVEVTDKPV 

DEALREAMPKVAKYAGGTNDKGIGMGMTV 

PISFAVFPNEDGSLQKKLKVWFRIPNQFQSDP 

PAPSDKSVKIEEREGITVYSMQFGGYAKEAD 

YVAQATRLRAALEGTATYRGDIYFCTGYDPP 

MKPYGRRNEIWLLKT 


865 


2215 


A 


7246 


559 


682 


RRLG A V AHA YTS STLGGRGG WIT* GQELQTS 
LANMAKPRLY 


866 


2216 


A 


7257 


641 


1310 


tctykylmgwirgrrsrhswemsefhnynl 
dlkksdfstr wqkqrcpwk skcrena spff 
fccfiavamgirfiimvaiwsavflnslfnqev 
qipltes ycgpcpknwi cyknnc yqffdeskn 
wyesqascmsqnasllkvyskedqdllk.lv 
ksyhwmgl vhiptngs wq we dg sil spnllt 

IIEMQKGDCALYASSFKGYIENCSTPNTYICM 
QRTV 


867 


2217 


A 


7288 


151 


396 


SIKIIEAFGSNGPDPWFFRYWSP*LFRQQWF1 
MPFFQTLWLMN ANRFC S1FTTTNV ANNC W W 
TPYHCWLSWVCRCESHGI 


868 


2218 


A 


7298 


3 


272 


PDTVIGGRGSGGKEFGRWVLW*VFE*RLGTP 
KGSCPAGGSRMVSESD*EGRGC*ASYPCAC* 
AGS*WR+GSRPAGRGTPPRSLSHARPP 


869 


2219 


A 


7332 


1223 


332


PRRDAEDRDESCLNPAFPIGLLHPNSVNSMAR 

FLTLC1WLLLLGPGLLATVRAECSQDCATCS 

YRLVRPADtNFLACVMECEGKLPSLKJWETC 

KELLQLSKPELPQDGTSTLREN SKPEESHLLA 

KRYGGFMKRYGGFMKKMDELYPMEPEEEA 

NGSEIL AKRYGGFMKKDAEEDDSLAN SSDLL 

KELLETGDNRERSHHQDGSDNEEEVSKRYGG 

FMRGLKRSPQLKEKAKELQKRYGGFMRRVG 

PQKW*MTSPQNRYGGFLKRFAEALPSDEEGE 

SYSKEVPEMEKRYGGFMRF 


870 


2220 


A 


7382 


216 


1018 


EIHQRLTERTQFLDESRKNPNS*QANLLRGGG 

AGQGRGREGAESGGSRGEGPGSDGRLPATGD 

FWSPRS QRRGCCGRRAPRPEAMENG A V YSPT 

TEEDPGPARGPRSGLAAYFFMGRLPLLRRVL 

KGLQLLLSLLAFICEEVVSQCTLCGGLYFFEF 

VSCSAFLLSLLILIVYCTPFYERVDTTKVKSSD 

FYTn.GTGCVFLLASIIFVSTHDRTSAEIAATVF 

GFI ASFMFLLDFITMLYEKRQESQI ,RKPEN1T 

RAEALTEPLNA 


871 


2221 


A 


7403 


3 


393 


SCAMCSGLL*LLLPIWLSWTLGTRGSEPRSVN 
DPGNMSFVKETVDKLLTGFRCFREREAAPRR 
ALRGAALPGESEAGDPESLRSSVNADW1QYS 
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DLWEAEVSTPRCEAGFCQECFRTPGNQEKDG 
PFIC 


872 


2222 


A 


7413 


1061 


359 


FVDIVSWEFPHCPEARFPAQHGQDSKRJLTLC 

PGGS*PQATLHIX>RMRVSASPTKEIQVKKYK 

CGLIKPCPA>rYFAFKICSGAANVVGPTMCFED 

RMIMSPVKNNVGRGLNIALVNGTTGAVLGQ 

KAFDMYSGDVMHLVXFLKE1PGGALVLVAS 

YDDPGTKMNDESRKLFSDLGSSYAKQLGFRD 

SWVFIGAKDLRGKSPFEQFLKEQPQTQNKYE 

GWPELLEMEGCMPPKPF 


873 


2223 


A 


7429


2242


2394 


ILKCAGHGGSCL* SQHFGRLRWEDRLRLGVQ 
DHPGQHCETPSLLKJERKLF 


874 


2224 


A 


/Wo 




894 


PCTSCVLWATLHLPASmKAPQAECGMISlTE 
WQKIGVGITGFGIFFILFGTLLYFDSVLLAFGN 
LLFLTGLSLIIGLRKTFWFFFQRHKLKGTSFLL 
GGVVIVLLRWPLLGMFLETYGFFSLFKGFFPV 
AFGFLGNVCNIPFLGALFRRLQGTSSMV+KTE 
MSSLNLDHWLKGAKREEWEPPPQSPALTHSP 
TYPGPPQVQKERNGAE QLTSNPQ VDSRGCQE 
AEMQTPRRLGWGWYHTLTLYLWEEK 


875 


2225 


A 


7498 


91 


251 


GEKPVPTWLQDEAGQWLLGFVAQPWGWPG 
SERHEP*HGGVLFRLGPSAPPGKL 


876 


2226 


A 


7544 


403 


587 


YSCLCFLFKHITSFKNS VH I WLGT V VHA YNPN 
ILG GQGGWIA* GQEFKTSLGNT VRPCLYK 


877 


2227 


A 


7566 


2 


940 


GCAPDTRFFVPEPGGRGAAPWVALVARGGC 

TFKDKVLVAARRNASAWLYNEERYGN1TLP 

MSHAGTGNIWIMISYPKGREILELVQKGIPV 

TMTIGVGTRHVQEFISGQSVVFVAIAFITMMII 

SLAWLIFYYIQRFLYTGSQIGSQSHRKETKKVl 

GQLLLHTVKHGEKG1DVDAENCAVC1ENFKV 

KDIIRILPCKHIFHRICrDPWLLDHRTCPMCKL 

DVIKALGYWGEPGDVQEMPAPESPPGRDPAA 

NLSLALPDDDGSDESSPPSASPAESEPQCDPSF 

KGDAGENTAIXEAGRSDSRHGGPIS 


878 


2228 


A 


7586 


315 


1232 


ERSLLCKVDVRWIYVSEGTKTQRRHRQGSLR 

RGRMQAACWYVLFLLQPTVYLVTCANLTNG 

GKSELLKSGSSKSTLKHIWTESSKDLSISRLLS 

QTFRGKEKDTDLDLRYDTPEPYSEQDLWDW 

LRNSTDLQEPRPRAKRRPrSTCTGKFKKMFGW 

GDFHSNIKTVXLNLLITGKrVDHGNGTFSVYF 

RHNSTGQGNV S VSL VPPTKTVEFDLAQQT VID 

AKDSKSFNCRIEYEKVDKATKNTLCNYDPSK 

TCYQEQTQSHVSWLCSKPFKVIC1YISFYSTD 

YKLVQKVCPDYNYHSDTPYFPSG 


879 


2229 


A 


7605 


479 


391 


TES WKLK WWSPTCLDQLNGS APGNVFTHG 


880 


2230 


A 


7612 


93 


659 


D AA V AMTAQGGL V ANRGRRKK W A1ELS GPG 

GGSRGRSDRGSGQGDSLYPVGYLDKQVPDTS 

VQETDRILVEKRCWDIALGPLKQIPMNLFIMY 

MAGNTISIFPTMMVCMMAWRPIQALMAISAT 

FKMLESSSQKFLQGLVYLIGNLMGLALAVYK 

CQSMGLLPTHASDWLAFIEPPERMEFSGGGL 

LL 


881 


2231 


A 


7615 


291 


1452 


SPQKTMRSHTITNfrTTSVSSWPYSSHRMRFrr 
NHSDQPPQNFSATPKVTTCPMDEXLL STVLTT 
SYSVEFrVGLVGKILALYVFLGIHRKRNSIQIYL 
LKV AI ADL I X IFCLPFRIM Y} il NQNK WTL G VI L 
CKV V G'FLF Y MNMYISLLLLGF1SLDRY IKINRSI 
QQRKAITTKQSIYVCCIVWMLAJLGGFLTM1IL 
TLKXGGHNSTMCFHYRDKFTNAKGEAIFNFIL 
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VVMFWLIFLL1ILSYIKIGKNLLRISKRRSKFPN 
SGK Y A 1TARN SFI VLIIFTICFVP YHAFRFIYISS 
QLNVSSCYWKEIVHKTNEIMLVLSSFNSCLDP 
VMYFLMSSNIRKIMCQLLFRRFQGEPSRSEST 
SEFKPGYSLHDTSVAVKIQSSSKST 


882 


2232 


A 


7617 


67 


379 


RQMALLKANKDLISAGLKEFSVLXNQQVFND 
PLVSEEDMVTWEDWrvnNTYINYYRQQVTGE 
PQERDKALQELRQELNTLANPFLAKYRDFLK 
SHELPSHPPPSS 


883 


2233 


A 


7622 


400 


215 


KVKTCRYNPKYSAANDTGFVDIPSREKDLAK. 
AVATVGPISVAVGASHVFFQFYKKGKHLSS 


884 


2234 


A 


7638 


2640 


2861 


APVLILQMVKLSIVLTPQFLSHDQGQLTKELQ 
QHVKSVTCPCEYLRKVSECRQMGPGALEQFP 
GLSCHTSHSG 


885 


2235 


A 


7642 


201 


455 


PSRGKMELEAMSRYTSPVNPAVFPHLTVVLL 
AIGMFFTA WFF V YE VTSTKYTRDIYKELLI SL 
VASLFMGFGVLFLLLWVGIYV 


886 


2236 


A 


7692 


61 


569 


APENPFSRQHFNSETKVKLSLKTGTWLGNHA 

HLGEHFSTHHELGLSGKWGFLVKNILEVIRN 

GGMETRHPGKVSSWFHRWDSRAEQHNHAE 

HHEDVPQGDEDSKVSEAQQEFPDWTCAGLP 

GLLPKALRVLLFQLKVQHRPGIHQQRPHQQD 

VSDHRYGRSVRQNRK 


887 


2237 


A 


7693 


85 


315 


NPGCCLPVAMRTSYLLLFTLCLLLSEMASGG 

NFLTGLGHRSDHYNCVSSGGQCLYSACPIFTK 

IQGTCYRGKAKCCK 


888 


2238 


A 


7702 


242 


1298 


APSHRRRYLSPSRSAGQLGNMALERLCSVLK 

VLLITVLVVEGLAVAOKTODGONIGIKHIPAT 

QCGIWVRTSNGGHFASPNYPDSYPPNKECIYI 

LEAAPRQR1ELTFDEHYYIEPSFECRFDHLEVR 

DGPFGFSPLIDRYCGVKSPPL1RSTGRFMWIKF 

SSDEELEGLGFRAKYSFIPDPDFTYLGGrLNPIP 

DCQFELSG ADGIVRS SQ VEQEEKTKPG QA VD 

CIWTrXATPKAKIYLRFLDYQMEHSNECKRKF 

VAVYDGSSSIENLKAKFCSTVANDVMLKTGI 

GVIRMWADEGSRLNRFRMLFTSFGGASPAQA 

ALSFCHSNMCrNNSLVCNGVQNCAYPWDEN 

HC 


889 


2239 


A 


7707 


185 


2911 


CHY1MNPSTHHPASAGGS1LGLFDFFGLGLGE 

MTMDALLARLKLLNPDDLREEIVKAGLKCGP 

rrSTTRFIFEKKLAQALLEQGGRLSSFYHHEA 

GVTALSQDPQRILKPAEGNPTDQAGFSEDRDF 

GYSVGLNPPEEEAVTSKTCSVPPSDTDTYRAG 

ATASKEPPLYYGVCPVYEDVPARNERIYVYE 

NKKEALQAVKM1KGSRFKAFSTREDAEKFAR 

GICDYFPSPSKTSLPLSPVKTAPLFSNDRLKDG 

LCLSESETVNKERANSYKNPRTQDLTAKLRK 

AVEKGEEDTFSDLrWSNPRYLIGSGDNPTTVQ 

EGCRYNTVMHVAAKENQASICQLTLDVLENP 

DFMRLMYPDDDEAMLQKIURYVVDLYLNTP 

DKMGYDTPLHFACKFGNADWKVLSSHHLI 

VKNSRNKYDKTPEDVICERSKNKSVELKERIR 

EYLKGHYYVPLLRAEETSSPVIGELWSPDQTA 

EASH VSRYGG SPRDP VLTLRAF AGPLSP AKAE 

DFRKL WKrPPREKAGFLHHVKKSDPERGFER 

VGRELAHELGYPWVEYWEFLGCFVDLSSQE 

GLQRLEEYLTQQEIGKKAQQETGEREASCRD 

KATTSGSNSISVRAFLDEDDMSLEEIKNRQNA 

ARNNSPPTVGAFGHTRCSAFPLEQEADLIEAA 






EPGGPHSSRNGLCHPLNHSRTLAGKRPKAPR 
GEEAHLPPVSDLTVEFDKLNLQN IGRSVSKTP 
DESTKTKDQILTSRTNAVERDLLEPSPADQLG 
NGHRRTESEMSARIAKMSLSPSSPRHEDQLEV 
TREPARRLFLFGEEPSKLDQDVLAALECADV 
DPHQFPAVHRWKSAVLCYSPSDRQSWPSPAV 
K GRFK S QLPDLSGPHS YSPGRNS VAGSNPAKP 
GLGSPGRYSPVHGSQLRRMARLAELAAL 


890 


2240 


A 


7711 


360 


269 


RHMPVIPALWEAEVGGLLEPRSSRSAWATE 


891 


2241 


A 


7721 


61 


1175 


KLPWEPSFLIKMQI1RHSEQTLKTALISKNPVL 
V SQ YEKLD AGEQRLMNEAFQPA SDLFGPITL 
HSPSDWITSHPEAPQDFEQFFSDPYRKTPSPN 
KRSIYIQSIGSLGNTRIISEEYIKWLTGYCKAYF 
YGUWKLLEPVPVSVTRCSFRVNENIHNLQIH 
AG DILKFLKKKKPED AFCWG ITMIDLYPRDS 
WNF VFGQA SLTDG VG IFSFAR YGSDF YSMH Y 
KGKVJCKLKKTSSSDYSIFDNYYIPEITSVLLLR 
SCKTLTHEIGHIFGLRHCQWLACLMQGSNHL 
EEADRRPLNLCPICLHKLQCAVGFSJVERYKA 
LVRWDDDES SDTPG ATPEH SHEDNGNLPKPV 
EAFKEWKEWIIKCLAVLQK 


892 


2242 


A 


7723 


2 


1650 


SAPT AP ARPCRAERG SGGGML ALLAAS V ALA 

VAAGAQDSPAPGSRFVCTALPPEAVHAGCPL 

PAMPMQG G AQSPEEELRAA VLQLRETWQQ 

KETLASARA1RELTGKLARCEGLAGGKARGA 

GATGKDTMG DLPRDPGH WEQLSRSLQTLK 

DRLESLEPLPAMPMQGGAQSPEEELRAAVLQ 

LRETWQQ KETLA S ARAIRELT GKL ARC E GL 

AGGKARGAGATGKDTMGDLPRDPGHWEQ 

LSRSLQTLKDRLESLEHQLRANVSNAGLPGD 

FREVLQQRLGELERQLLRKGAELEDEKSLLH 

NETS AHRQKTESTLN ALLQRVTELERGN SAP 

KSPNAFKVSLPLRTNYLYGKUCKTLPELYAFT 

ICLWLRSSASPGMGTPFSYAVPGQAKErVLIE 

WGNNPIELLINDKVAQLPLFV SDGK WHH1CV 

T WJTRDGMWEAFQDGKKLGTGENL AP WHPI 

KPGGVLILGQEQDTVGGRFDATQAFVGELSQ 

FNIWDRVLRAQEIVNIANCSTNMPGNIIPWVD 

NNVDVFGGASKWPVETCEBRLLDL 


893 


2243 


A 


7729 


3554 


2419 


LTAGTAMNYPLTLEMDLENLEDLFWELDRL 
UN Y NJJ liLVhNrLLCPATEGPLMASFKAVFVP 
VAYSLIFLLGVIGNVLVLVILERHRQTRSSTET 
FLFHLAVADLLLVF1LPFAVAEGSVGWVLGTF 
LCKTVIALHKVNFYCSSLLLAC1AVDRYLATV 
HAVHAYRHRRLLS1HITCGTIWLVGFLLALPEI 
L.C AJt V Si^tiHNISaLPRCTr SQENQAETHA WF 
TSRFLYHVAGFLLPMLVMGWCYVGWHRLR 
^AyKKr^KyKAVKVAlLV I birr lA^WSPYHI V 

tFLDTLARLFCAVDNTCKLNGSLPVAJTMCEFL 
GL AJ ICCLNPMLYTF AG VKFRSDLSRLLTKLG 
CTGPASLCQLFPSWRRSSLSESENATSLTTF 


894


2244 


A 


7738 


670 


287 


FVTRAGRWGAGARVRGGAGGMASGAARWL 
VL APVRSGALRS GPSLRKDGD VS AA WSG SGR 
SLVPSRSVIVTRSGA1LPKPVKMSFGLLRVFSI 
VIPFLYVGTLISKNFAALLEEHDIFVPEDDDDD 
D 


895 


2245 


A 


7753 


119 


278 


AP Y AHSQ VHCLDKV CGLLPFLNPEVPDQFYR 
LWLSLFLHAGKEAPHCPRTRPL 


896 


2246 


A 


7754 


1 


372 


SPAWWNSQQRWSPFLALLTLEPTFHHLLPIM 
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QVSTAALAVLLCTMALCNQVLSAPLAADTPT 
ACCFS YTSRQEPQNUAD YFET SSQCSKPS VLFL 
TKRGRQVCADPSEEWVQKYVSDLELSA 


897 


2247 


A 


7761 


1725 


445 


RPRRRGTHHFSCVLGSFRVSAMFPRVSTFLPL 

RPLSRHPLSSGSPETSAAAIMLLTVRHGTVRY 

RSSALLARTKNNIQRYFGTNSVICSKKDKQSV 

RTEETSKETSESQDSEKE>fTKKDLLGIIKGMK 

VELSTVNVRTTKPPICRRPLKSLEATLGRLRRA 

TEYAPKKRIEPLSPELVAAASAVADSLPFDKQ 

TTKSELLSQLQQHEEESRAQRDAKRPKJSFSNI 

ISDMKVARSATARVRSRPELRIQFDEGYDNYP 

GQEKTDDLKKRKNrFTGKRLNlFDMMAVTKE 

AFETDTSPSLWDVEFAKQLATVNEQPLQNGF 

EELIQWTKEGKLWEFPINNEAGFDDDGSEFH 

EHIFLEKHLESFPKQGPIRHFMELVTCGLSKNP 

YLSVKQKVEHIEWFRNYFNEKKD1LKESNIQF 

KLRPWKFLFRNN 


898 


2248 


A 


7775


85 


496 


SCQTTQPPAQSCSTGTMRIMLLFTALLAFSLA 

QSFGAVCKEPQEEWPGGGRSKRDPDLYQLL 

QRLFKSHSSLEGLLKALSQASTDPKESTSPEK 

RDMHDFFVGLMGKRSVQPDSPTDVNQENVP 

SFGILKYPPRAE 


899 


2249 


A 


7785 


179 


703 


PFHLGASSNTFRLQVQTQESKAQKEVKMGFI 

FSKSMNESMKNQKEFMLMNARLQLERQLIM 

QSEMRERQMAMQIAWSREFLKYFGTFFGLA 

AISLTAGATKKKKPAFLVPIVPLSFELTYQYDL 

GYGTLLERMKGEAEDILETEKSKLQLPRGMIT 

FESIEKARKEQSRFFIDK 


900 


2250 


A 


7789 


1465 


300 


VWLPLKSYKJRSPSLHCQCEIFREEFLFSSLQE 

GRDKDTFSKMAMVSEFLKQAWFIENEEQEY 

VQTVKSSKGGPGSAVSPYPTFNPSSDVAALH 

KAEN1VKGVDEAT1IDILTKRNNAQRQQIKAAY 

LQETGKPLDETLKKALTGHLEEWLALLKTP 

AQFDADELRAAMKGLGTDEDTLIEILASRTN 

KEIRDrNRVYREELKRDLAKDITSDTSGDFRN 

ALLSL AKGDRSEDFG VNEDLAD SD ARAL YE A 

GERRKGTDVNVFNTILTTRSYPQLRRVFQKY 

TKYSKHDMNKVLDLELKGDIEKCLTATVKCA 

TSKPAFFAEKLHQAMKGVGTRHKALERJMVS 

RSEIDMNDIKAFYQKMYGISLCQA1LDETKGD 

YEKILVALCGGN 


901 


2251 


A 


7796 


2 


807 


VEFHPQRARAGARAPSMGVLLTQRTLLSLVL 

ALLFPSMASMAAIGSCSKEYRVLLGQLQKQT 

DLMQDTSRLLDPYIRIQGLDVPKLREHCRERP 

GAFPSEETLRGLGRRCFLQTLNATLGCVLHRL 

ADLEQRLPKAQDLERSGLNIEDLEKLQMARP 

NILGLRNNTYCMAQLLDNSDTAEPTKAGRGA 

SQPPTPTPASDAFQRiCLEGCRFLHG YHRFMH 

SVGRVFSKWGESPNRSRRHSPHQALRKGVRR 

TRPSRKGKRLMTRGQLPR 


902 


2252 


A 


7802 


2 


721 


TAARRRQKGTAARRLQKGTAARRRQfCGTAA 
RRRQKGTAARRPQKGTAARRRQKGTAARRR 
QKGTAARRRQKGTAARRPQKGTAARRRQKG 
TAARRRQKGTAARRRQKGLAIASRGCPCASR 
AGGVRGAGSRLRAMAPKVFRQYWDIPDGTD 
CHRKAYSTTSIASVAGLTAAAYRVTLNPPGTF 
LEGVAKVGQYTFTAAAVGAVFGLTTCISAHV 
REKPDDPLN YFLGGC AGGLTLG ARTHN YGIG 
AAACVYFGLAASLVKMGRLEGWEVFAKPKV 
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903 


2253 


A 


7807 


1 


584 


PWLPWSDGRAARSSRKCPRSRFPVQVGKMA 

VSTVFSTSSLMLALSRHSLLSPLLSVTSFRRFY 

RGDSPTDSQKDMIEIPLPPWQERTDESIETKR 

ARLLYE5RKRGMLENCILLSLFAKEHLQHMT 

EKQLNLYDRLINEPSNDWDIYYWATEAKPAP 

EIFENEVMALLRDFAKNKNKEQRLRAPDLEY 

LFEKPR 


904 


2254 


A 


7813 


40 


821 


GAGRALGHLETGAGDVAAALPARKFPRSLLG 

AGARLTGWTMNVFRILGDLSHLLAMILLLGK 

IWRSKCCKGISGKSQILFALVFTTRYLDLFTNF 

1 SIYNT VMKVVFLLC AYVTVYMTYGKFRKTF 

DSENDTFRLEFLLVPVTGLSFI ,ENYSFTLLECL 

WTFSIYLESVAILPQLFMISKTGEAETITTHYL 

FFLGLYRALYLANWTRRYQTENFYDQLAVVS 

GVVQTIFYCDFFYLYVTKGRSWDDSNADTGL 

RSYSSI 


905 


2255 


A 


7817 


1399 


881 


LSNKDVLSPQLKDENSKLRRKLNEVQSFSEA 

QTEMVRTLERKLEAKMIKEESDYHDLESWQ 

QVEQNLELMTKRAVKAENHWKLKQEISLL 

QAQVSNFQRENEALRCGQGASLTWKQNAD 

VALQNLRWMNSAQASIEQLVSGAETLNLVA 

EILKSIDRISEVKDEEEDS 


906 


2256 


A 


7822 


3 


1462 


DSPRNRFErLGRPTRTFTRPGPRPAMEDLDAL 

LSDLElTrSHMPRSGAFKERPAEPLTPPPSYG 

HQPQTGSGESSGASGDKDHLYSTVCKPRSPK 

PAAPAAPPFSSSSGVLGTGLCELDRLLQELNA 

TQFNITDEIMSQFPSSKVASGEQKEDQSEDKK 

RPSLPSSPSPGLPKASATSATLELDRLMASLSD 

FRVQNHLP ASGPTQPP W SSTNEG SPSPPEPTG 

KGSLDTMLGLLQSDLSRRGVPTQAKGLCGSC 

NKPIAGQVVTALGRAWHPEHFVCGGCSTAL 

GGSSFFEKDGAPFCPECYFERFSPRCGFCNQPI 

RHKMVTALGTHWHPEHFCCVSCGEPFGDEG 

FHEREGRPYCRRDFLQLFAPRCQGCQGPILDN 

Y1SALSALWHPDCFVCRECFAPFSGGSFFEHE 

GRPLCENHFHARRGSLCATCGLPVTGRCVSA 

LGRRFHPDHFTCrFCLRPLTKGSFQERAGKPY 

CQPCFLKLFG 


907 


2257 


A 


7828 


1792 


1671 


FIYVNQSFAPSPDQEVGTLYECFGSDGKLVLH 
YCKSQAWG 


908 


2258 


A 


7842 


110 


1172 


KLSCPCSHGTRVTAVRGPRLKAGVQWHDLG 

SLQPPPSGLKQSSHLSLSSSWDFRHAPTKPET 

YTCPKMIEMEQAEAQLAELDLLASMFPGENE 

LI VNDQL AV AELKDCIEKKTMEGRS SK VYFTI 

NMNLDVSDEKMAMFSLAC1LPFKYPAVLPEI 

TVRSVLLSRSQQTQLNTDLTAFLQKHCHGDV 

CILNATEWVREHASGYVSRDTSSSPTTGSTVQ 

SVDLIFTRLWIYSHHIYNKCKRKNILEWAKEL 

SLSGFSMPGKPGWCVEGPQSACEEFWARLR 

KLNWKRILIRIIREDIPFDGTNDETERQRKFSIF 

EEKVFSVNGARGNHMDFGQLYQFLNTKGCG 

DVFQMFLWV 


909 


2259 


A 


7870 


3067 


2923 


EGICVYTFIYVHMYTRTCMHTYPYMYMNSV 
L1SSEILLIPSKYLFESK 


910 


2260 


A 


7884 


212 


4874 


G ALT WSHPLLA VCPQGVWLG STPS GSP ALLP 
PSHRVNAEPGCVVTNACASGPCPPHANCRDL 
WQTFSCTCQPGYYGPGCVDACLLNPCQNQG 
SCRHLPGAPHGYTCDCVGGYFGHHCEHRMD 
QQCPRGWWGSPTCGPCNCDVHKGFDPNCNK 
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TNGQCHCKEFHYRPRG SDSCLPCDCYP VG ST 

SRSC APHSG QCPC RPG ALGRQCN SCDSPF A£ V 

TASGCRVLYDACPKSLRSGVWWPQTKFGVL 

ATVPCPRG ALGLRG AGAA VRLCDE AQG WLE 

P DL FN CT SP AF REL S LLL DGL ELNKT AL DTME 

AKKLAQRLREVTGHTDHYFSQDVRVTARLL 

AHLLAFESHQQGFGLTATQDAHFNENLLWA 

GSALLAPETGDLWAALGQRAPGGSPGSAGLV 

RHLEEYAATLARNMELTYLNPMGLVTPNIML 

SIDRMEHPSSPRGARRYPRYHSNLFRGQDAW 

DPHTHVLLPSQSPRPSPSEVLPTSSSIENSTTSS 

WPPPAPPEPEPGISIIIIXVYRTI,GGLLPAQFQ 

AERRGARLPQNPVMNSPWSVAVFHGRNFLR 

GILESPISLEFRLLQTANRSKAICVQWDPPGLA 

EQHGVWTARDCELVHRNGSHARCRCSRTGT 

FGVLMDASPRERLEGDLELLAVFTHVWAVS 

VAALVLTAAILLSLRSLKSNVROIHANVAAA 

LGV AELLFLLG I HRTHNQL VCTA WILLHYFF 

LSTFAWLFVQGLHLYRMQVEPRNVDRGAMR 

FYHALGWGVPAVLLGLAVGLDPEGYGNPDF 

CWISVHEPLIWSFAGPWLVIVMNGTMFLLA 

ARTSCSTGQREAKKTSALTLRSSFLLLLLVSA 

SWLFGLLAVNHSILAFHYLHAGLCGLQGLAV 

LXLFCVLNADARAAWMPACLGRKAAPEEAR 

PAPGLGPG AYNNT ALFEESGLIRITLGA STVS S 

VSSARSGRTQDQDSQRGRSYLRDNVLVRHGS 

AADHTDHSL Q AHAGPTDLD V AN4FHRD AG A 

DSDSDSDLSLEEERSLSIPSSESEDNGRTRGRF 

QRPLCRAAQSERLLTHPKDVDGNDLLSYWPA 

LGECEAAPCALQT WG SERRLGLDTSKD AAN 

NKQPDPALTSGDETSLGRAQRQRKGILKNRL 

QYPLVPQTRGAPELSWCRAATLGHRAVPAAS 

YGRIYAGGGTGSLSQPASRYSSREQLDLLLRR 

QLSRERLEEAPAPVLRPLSRPGSQECMDAAPG 

RLEPKDRGSTLPRRQPPRDYPOAMAGRFGSR 

DALDLGAPREWLSTLPPPRRTRDLDPQPPPLP 

LSPQRQLSRDPLLPSRPLDSLSRSSNSREQLDQ 

VPSRHPSREALGPLPQLLRAREDSVSGPSHGP 

STEQLDILSSILASFNSSALSSVQSSSTPLGPHT 

TATPSATASVLGPSTPRSATSHSISELSPDSEPR 

DTQALLSATQAMDLRRPJDYHMERPLLNQEH 

LEELGRWGSAPRrHQWRTWLQCSRARAYAL 

LLQHLPVLVWLPRYPVRDWLLGDIXSGLSVA 

JDMQLPQGL A YALL AGLPP VFGLY SSF YPVEIY 

FLFGTSRHISVESLCVPGPVDT 


911 


2261 


A 


7890 


21 


806 


EFGTSRSSRSMAEDLGLSFGETASVEMLPEHG 

SCRPKARS SS ARWALTCCLVLLPFLAGLTTYL 

LVSQLRAQGEACVQFQALKGQEFAPSHQQV 

YAPLRADGDKPRAFILTVVRQ TF1 QHFKNQFP 

ALHWEHELGLAFTKNRMNYTNKFLLIPESGD 

YFIYSQVTFRGMTSECSEIRQAGRPNKPDSrrV 

VTTKVTDSYPEPTQLLMGTKSVCEVGSNWFQ 

PIYLGAMFSLQEGDKLMVNVSDISLVDYTKE 

DKTFFGAFLL 


912 


2262 


A 


7891 


1263 


111


A CGIRHEGALPGLT ATPE AMLRFLPDLAFSFL 
LILALGQAVQFQEYVFLQFLGLDKAPSPQKFQ 
P VP YILKKIFQDRE AAATTG V SRDLC YVKELG 
VRGNVLRFLPDQGFFLYPKKISQASSCLQKLL 
YFKI.SAIKEREQLTLAQLGI.DEGPNSYYNLGP 
ELELALFI.VQEPHVWGQTTPKPGKMFVLRSV

PWPQGAVHFNLLDVAKDWNDNPRKNFGLFL 

E1LVKEDRDSGVNFQPEDTCARLRCSLHASLL 

V\ r TLNPDQCHPSRKRRAAlPVPKLS CKNLCH 

RHQLFINFRDLGWHKWIIAPKGFMANYCHGE 

CPFSLTISLNSSNYAFMQALMHAVDPEIPQAV 

CIPTKLSPISMLYQDWIDNVILRHYEDMVVD 

ECGCG 


913 


2263 


A 


7892 


15 


849 


ASRLPRGPGCGADMRPLLGLLLVFAGCTFAL 

YLLSTRLPRGRRLGSTEEAGGRSLWFPSDLAE 

LRELSEVLREYRKEKQAYVFLLFCGAYLYKQ 

GFAJPGSSFLNVLAGALFGPWLGLLLCCVLTS 

VGATCCYLLSSIFGKQLWSYFPDKVALLQR 

KVEENRNSLPFFLLFLRLFPMTPNWFLNLSAPl 

LMP1VQFFFSVLIGL1PYNFICVQTGS1LSTLTS 

LDALFSWDTVFKXLAIAMVALIPGTL1KKFSQ 

KHL QLNETSTANHIHS RKDT 


914 


2264 


A 


7893 


815 


959 


KSGWVWWLTPLIPALWEAQTEGSLRPEVKN 
RLSNTTR PFFSKKKK IL V 


915 


2265 


A 


7909 


3 


641 


HASGPGGLLRRRRGSGANMPVARSWVCRKT 

YVTPRRPFEKSRLDQELKLIGEYGLRNKREV 

WRVKFTLAKIRKAARELLTLDEKDPRRLFEG 

NALLRRLVRIGVLDEGKMKLDYILGLKIEDFL 

ERRLQTQVFKLGLAKSIHHAHVLIQQCHIRVR 

EQWNILFFTVRLDSQKHIDFSLCFPIGVANPS 

HVKRKNASKGQGGAGARDDEEEE 


916 


2266 


A 


7914 


3 


967 


VAHTQWHTCQRLSQLTHRSILKYLLIDTHAC 

QVLILKHTI LASLSLPSCQECFPSSIPSASHMVS 

HPHPPPSPRWGQTPEGLPAASPCGPGPRSCFS 

SILPTGDSWGMLACLCTVLWHLPAVPALNRT 

GDPGPGPSIQKTYDLTRYLEHQLRSLAGTYLN 

YLGPPFNEPDFNPPRLGAETLPRATVDLEVW 

RSLNDKLRLTQNYEAYSHLLCYLRGLNRQAA 

TAELRRSLAHFCTSLQGLLGSIAGVMAALGY 

PLPQPLPGTEPTWTPGPAHSDFLQKMDDFWL 

L KEL QT WL WRS AKDFNRLKKKMQPPAAA VT 

LHLGAHGF 


917 


2267 


A 


7921 


2 


1166 


RPRRGQGLVQEVQTENVTVAEGGVAEITCRL 

HQYDGSIWIQNPARQTLFFNGTRALKDERFQ 

LEEFSPRRVRJRLSDARLEDEGGYFCQLYTED 

THHQIATLTVLVAPENPWEVREQAVEGGEV 

ELSCLVPRSRPAATLRWYRDRKELKGVSSSQ 

ENGKVWSVASTVRFRVDRKDDGGinCEAQN 

QALPSGHSKQTQYVLDVQYSPTARIHASQAV 

VREGDTLVLTCAVTGNPRFNQERWNRGNESL 

PERAEAVGETLTLPGLVSADNGTYTCEASNK 

HGHARALYVLVVYGESRLRPTEGGGGAPDP 

GA V VEAQTS VPYAIV GGILALL VFLIIC VL VG 

MVWCSVRQKGSYLTHEASGLDEQGEAREAF 

LNGSDGHKRKEEFFI 


918 


2268 


A 


7938 


3 


2653 


RRRLPPASPPSSSVSSSLSPSAWMACRWSTK 

ESPRWRSALLLLFLAGVYGNGALAEHSENVH 

ISGVSTACGETPEQIRAPSGirrSPGWPSEYPAK 

INCSWFIRANPGEimSFQDFDIQGSRRCNLD 

WLTIETYKNIESYRACGSTIPPPYISSQDHIWIR 

FHSDDNISRKGFRLA YF SGK SEEPNCACDQFR 

CGNGKCIPEAWKCNNMDECGDRSDEEICAKE 

ANPPTAAAFQPCAYNQFQCLSRFTKVYTCLP 

ESLKCDGNIDCLDLGDEIDCDVPTCGQWLKY 

F YGTFNSFNYPDF YPPG SNCTWLIDTGDHRK 
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VILRFTDFKLDGTG YGDY VKJ YDGLEENPHK 

LLRVLTAFDSHAFLTVVSSSGQ1RVHFCADKV 

NAARGFNATYQVDGFCLPWEIPCGGNWGCY 

TEQQRCDGYWHCPNGRDETNCTMCQKEEFP 

CSRNGVCYPRSDRCNYQNHCPNGSDEKNCFF 

CQPGNFHCKNNRCVFESWVCDSQDDCGDGS 

DEENCPVrVPTRVTTAAVlGSLICGLLLVIALG 

CTCKLYSLRMFERRSFETQLSRVEAELLRREA 

PPSYGQLIAQGLIPPVEDFPVCSPNQASVLENL 

RLAVRSQLGFTSVRLPMAGRSSNIWNRIFNFA 

RSRHSGSLALVSADGDEWPSQSTSREPERNH 

THRSLFSVESDDTDTENERRDMAGASGGVAA 

PLPQKVPPTTAVEATVGACASSSTQSTRGGH 

ADNGRDVTSVEPPSVSPARHQLTSALSRMTQ 

GLRWVRFTLGRSSSLSQNQSPLRQLDNGVSG 

REDDDDVEMLIPISDGSSDFDVNDCSRPLLDL 

ASDQGQGLRQPYNATNPGVRPSNRDGPCERC 

GIVHTAQIPDTCLEVTLKNETSDDEALLLC 


919 


2269 


A 


7951 


1674 


1839 


VVRVTCCPPARSTTERTNAYDEEDCVEMVAS 
GGWr^VACHTTMYFMCEFDKKNM 


920 


2270 


A 


7953 


47 


572 


GGRASWPEQAKEPRREGHTOKQQTEDVLAA 

GLRCLPHLPAICARRMSPAFRAMDVEPRAKG 

VLLEPFVHQVGGHSCVLRFNETTLCKPLVPRE 

HQFYETLPAEMRKFTPQYKGKSQLLEGLPHW 

RGDVRDRGHGRPWQPSLEPSLPPTLCFPSLSS 

FSSSWPSAQHLTPSVFNPW 


921 


2271 


A 


7957 


612 


812 


RSGRTWTGIGYSKALQSSNRNTKSLLQNEF 

MMVYSFRALSFKESTWATFQHGGEATKSRSL 

SSTQ 


922 


2272 


A 


7967 


1443 


1660 


ENrTEKWKEIWMCRGNKKSCCWTFIKDRHLT 
VSCCKSKSGETLLICIFCSNLVGFFFFGIRGFSN 
WELVKPN 


923 


2273 


A 


7981 


1 


3023 


GSAPRAATAMARARPPPPPSPPPGLLPLLPPLL 

I.LPLLI.LPAGCRALEETLMDTKWVTSELAV/T 

SHPESGWEEVSGYDEAMNPIRTYQVCNVRES 

SQNNWLRTGFIWRRDVQRVYVELKFTVRDC 

NSIPNTPGSCKETFNLFYYEADSDVASASSPFW 

MENPYVKVDTIAPDESFSRLDAGRVNTKVRS 

FGPLSKADFYLAFQDQGACMSLISVRAFYKK 

CASTTA GFALFPFTLTG AEPTSLVI APGTCIPN 

AVEVSVPLKLYCNGDGEWMVPVGACTCATG 

HEPAAKESQCRPCPPGSYKAKQGEGPCLPCPP 

NSRTTSPAASICTCHNNFYRADSDSADSACTT 

VPSPPRGVISNVNETSLILEWSEPRDLGVRDD 

LLYNVICKKCHGAGGASACSRCDDNVEFVPR 

QLGLSEPRVHTSHLLAHTRYTFEVQAVNGVS 

GKSPL PPRY AA VNITTNQAAPSEVPTLRLHSS 

SGSSLTLSWAPPERFNGVILDYEMKYFEKSEG 

IASTVTSQMNSVQLDGLRPDARYWQVRART 

VAGYGQYSRPAEFETTSERGSGAQQLQEQLP 

LIVGSATAGLVFWAVWIAIVCLRKQRHGS 

DSEYTEKLQQYIAPGMKVYIDPFTYEDPNEA 

VREFAKEIDVSCVKJEEVIGAGEFGEVCRGRL 

KQPGRREVFVAIKTLKVGYTERQRRDFLSEA 

SIMGQFDHPNIIRLEGWTKSRPVMILTEFME 

NCALDSFLRJLNDGQFTVIQLVGMLRGIAAGM 

KYLSEMNYVHRDLAARNILVNSNLVCKVSDF 

GLSRFLEDDPSDPTYTSSLGGKJPIRWTAPEAI 

AYRKFTSASDVWSYGrVMWEVMSYGERPY 
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WDMSNQDVINAVEQDYRLPPPMDCPTALHQ 

LMLDCWVRDRNLRPKFSQIVNTLDKL1RNAA 

SLKVIASAQSGMSQPLXDRTVPDYTTFTTVGD 

WLDA1KMGRYKESFVSAGFASFDLVAQMTA 

EDLLRJGVTLAGHQKKILSSIQDMRLQMNQT 

LPVQV 


924 


2274 


A 


7985 


1 


503 


FRPRTKKATAMYLEHYLDSIENLPCELQRNF 

QLMRELDQRTEDKKAEIDILAAEYISTVKTLS 

PDQRVERLQKIQNAYSKCKEYSDDKVQLAM 

QTYEMVDKrHRRLDADLARFEADLKDKMEG 

SDFESSGGRGLKKGRGQKEKRGSRGRGRRTS 

EEDTPKKKKHKGG 


925 


2275 


A 


7994 


447 


589 


LPCSFCAQCMSSFERVWLQQSHFHNPRWNSR 
SPIRCYCQHWPHCVHC 


926 


2276 


A 


7996 


925 


582 


GPCXVCCITLAIMLQCHSFYRKDVQVEHPKS 
LNPKYSQIENFLSADMALKRKCLLSISDLDFW 
IWDAQPVGIMQTLQNLKKIPNPGCFWSQAFQI 
RDTQPILPLGGRYYTTIRQ 


927 


2277 


A 


7998 


2 


353 


RIQRPLNSRSPNHSLFVKAELTAKQATMKLSV 
CLLLVTLALCCYQANAEFCPALVSELLDFFFI 
SEPLFKLSLAKFDAPPEAVAAKLGVKRCTDQ 
MSLQKRSLIAEVLVKILKKCSV 


928 


2278 


A 


8004 


130 


588 


LAPLRCQPGTRTQPRSHPAANDPSAAMSAAG 
A RGLRATYHRLLDK VELMLP EKLRPL YN H P A 
GPRTVFFWAPIMKWGLVCAGLADMARPAEK 
LSTAQSAVLMATGFIWSRYSLVITPKNWSLFA 
VNFFVGAAGASQLFRrWRYNQELKAXAHK 


929 


2279 


A 


8007 


2 


1016 


EFARRRVFIAAREMSLLRSLRVFLVARTGSYP 

AGSLLRQSPQPRHTFYAGPRLSASASSKELLM 

KLRRKTGYSFVNCKKALETCGGDLKQAEIWL 

HKEAQKEGWSKAAKLQGRKTKEGLIGLLQE 

GNTTVLVEVNCETDFVSRNLKFQLLVQQVAL 

GTMMHCQTLKDQPSAYSKGFLNSSELSGLPA 

GPDREGSLKDQLALAIGKLGENMILKRAAWV 

KVPSGFYVGSYVHGAMQSPSLHKLVLGKYG 

ALVICETSEQKTNLEDVGRRLGQHWGMAPL 

SVGSLDDEPGGEAETKMLSQPYLLDPS1TLGQ 

YVQPQGVSWDFVRFECGEGEEAAETE 


930 


2280 


A 


8008 


3 


1679 


NSRVWGPWTEPSAGSLRPMARKQNRNSKEL 

GLVPLTDDTSHAGPPGPGRALLECDHLRSGV 

PGGRRRKDWSCSLLVASLAGAFGSSFLYGYN 

LSWNAFTPYIKAFYNESWERRHGRPEDPDTL 

TLLWSVTVS1FAIGGLVGTLIVKMIGKVLGRK 

HTLLANNGFAISAALLMACSLQAGAFEMLIV 

GRFIMGIDGGVALSVLPMYLSEISPKE1RGSLG 

QVTAIFICIGVFTGQLLGLPELLGKESTWPYLF 

GVIWPAWQLLSLPFLPDSPRYLLLEKHNEA 

RA VKAFQTFLGKAD VSQEVEEVL AES R VQRS 

IRLVSVLELLRAPYWWQVVTViVTMACYQL 

CGLNAIWFYTNSIFGKAGIPPAKIPYVTLSTGG 

IETLAAVFSGLVTEHLGRRPLLIGGFGLMGLFF 

GTLTITLTLQDHAPWVPYLSIVGILAIIASFCSG 

PGGIPFILTGEFFQQSQRPAAFnAGTVNWLSN 

FAVGLLFPFIQKSLDTYCFLVFATICITGArYL

YFVLPETKNRTYAEISQAFSKRNKAYPPEEKI 

DSAVTDGKINGRP 


931 


2281 


A 


8009 


861 


300 


AAGA WS AMPKAKGKTRRQKJ-'G YS VNRKRL 
NRNARRKAAPRIECSHIRHAWDHAKSVRQNL 
AEMGLAVDPNRAVPLRKRKVKAMEVDIEER 
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PKELVRKPYVLNDLEAEASLPEKKGNTLSRD 
LIDYVRYMVENHGEDYKAMARDEKNYYQD 
TPKQIRSKJNVYKRFYPAEWQDFLDSLQKRK 
MEVE 


932 


2282 


A 


8011 


412 


1 


SNLCLGNSWRWRWAKSRHHCIPTVTLSKRSG 
DIRGSHFSSPQRQRSQRVPGKETARVLRAGK 
QGRGQ1PIPCPWPPPPPPPPPGSPGPGCRQFHQ 
SLEAKARHPASVREMRGKVKMRRALRRAPA 
STRAS S RQPNPK 


933 


2283 


A 


8012 


147 


1077 


PPVPPASRSDMAQNLKDLAGRLPAGPRGMGT 

ALKLLLGAGAVAYGVRESVFTVEGGHRAIFF 

NRIGGVQQDTILAEGLHFRIPWFQYPIIYDIRA 

RPRKISSPTGSKDLQMVNISLRVLSRPNAQEL 

PSMYQRLGLDYEERVLPSIVNEVLKSWAKF 

NASQLITQRAQVSLLIRRELTERAKDFSLILDD 

VAITELSFSREYTAAVEAKQVAQQEAQRAQF 

LVEKAKQEQRQKIVQAEGEAEAAKMLGEAL 

SKNPGYIKLRJORAAQNISKTIATSQNRIYLTA 

DNLVLNLQDESFTRGSDSLIKGKK 


934 


2284 


A 


8023 


255 


982 


S QFSLSQ VL VDS AEEG SLAAAAEL AAQKREQ 

RlvRKFRELHL^lRNEARKLNHQEVVEEDKRL 

KXPANWEAKKARLEWELKEEEKKKECAARG 

EDYEKVKLLEISAEDAERWERKKKRKNFDLG 

FSDYAAAQLRQYHRLTKQIKPDMETYERLRE 

KHGEEFFPTSNSLLHGTHVPSTEEIDRMVEDLE 

KQIEKRDKYSRRRPYNDDADIDYINERNAKF 

NKKAERFYGKYTAEIKQNLERGTAV 


935 


2285 


A 


8027 


59 


310 


LVSSTVNLLTEKAPWNSLAWTVTSYVFLKFL 
QGGGTG STG MRDS ALTLLGIGPSHRHSLSIRL 
SQHSSPAPMYSQTFHILVLG 


936 


2286 


A 


8032 


1 


639 


SGRECNMAKTYDYLFKLLUGDSGVGKTCVL 

FRFSEDAFNSTFISTIGIDFKIRTTELDGKRIKLQ 

IWDTAGQERFRTITTAYYRGAMGIMLVYDIT 

NEKSFDNIRNWIRNIEEHASADVEKMILGKKC 

DVNDKRQVSKERGEKLALDYGIKFMETSAK 

ANrNVENAFFTLARDIKAKMDKKLEGNSPQG 

SNQGVKITPDQQKRSSFFRCVLL 


937 


2287 


A 


8039 


393 


311 


EETrHSENSYILEKYIPISANLTLTIA 


938 


2288 


A 


8052 


675 


1334 


LHP AATSTA WLHVPPGLSMALS WVLTVLSLL 

PLLEAQIPLCANLVPVPITNATLDRITGKWFYI 

/\SAFRNEEYNKSVQEIQATFFYFTPNKTEDTIF 

LREYQTRQDQCIYNTTYLNVQRENGTISRYV 

GG QEHF AHLLILRDTKTYML AFDVNDEKNW 

GLSVYADKPETTKEQLGEFYEALDCLRIPKSD 

WYTDWKKDKCEPLEKQHEKERKQEEGES 


939 


2289 


A 


8055 


12 


1039 


SSVAEFPERVQLSQPQNWNFSGAGGAWSLDF 

AEQLKWSAELARLGESIMDGKQGGMDG SKP 

AGPRDFPGIRLLSNPLMGDAVSDWSPMHEAA 

IHGHQLSLRNLISQGWAVNIITADHVSPLHEA 

CLGGHLSCVKILLKHGAQVNGVTADWHTPL 

FNACVSGSWDCVNLLLQHGASVQPESDLASP 

IHEAARRGHVEC VNSLI AYGGNIDHKI SHLGT 

PLYLACENQQRACVXKLLESGADVNQGKGQ 

DSPLHAVARTASEELACLLMDFGADTQAKN 

AEGKRPVELVPPESPLAQLFLEREGPPSLMQL 

CRLRJRKCFGIQQHHKITKLVLPEDLKQFLLH 

L 


940 


2290 


A 


8058 


2 


1203 


KVL S IREPAHST ARKASEPS QPSQPSQPGGHLI 
ARLRTMDLHLFDYSEPGNFSDISWPCNSSDCI 
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VVDTVMCPNMPNKSVLLYTLSFIYIFIFV1GMI 

ANS V V V W VN 1QAJC1TG YDTHC YJL NL AJ ADL 

WWLTIPVWWSLVQHNQWPMGELTCKVTH 

LIF^mLFGSIFFLTCMSVDRYLSrTTFTNTPSS 

RKKMVRRVVCILVWLLAFCVSLPDTYYLKT 

VTSASNNETYCRSFYPEHSIKEWLrGMELVSV 

VLGFAVPFSIIAVFYFLLARAISASSDQEKHSS 

RKIIFSYWVFLVCWLPYHVAVLLDIFSILHYI 

PFTCRLEHALFTALHVTQCLSLVHCCVNPVL 

YSFINRNYRYELMKAFIFKYSAKTGLTKLIDA 

SRVSETEYSALEQSTK 


941 


2291 


A 


8059 


73 


432 


DMAGLMTIVTSLLFLGVCAHH1IPTGSVVLPS 
PCCMFFVSKRIPENRVVSYQLSSRSTCLKAGV 
IFTTKKGQQFCGDPKQEWVQRYMKNLDAKQ 
KKASPRARAVAVKGPVQRYPGNQTTC 


942 


2292 




8067 


278 


1262 


GGIGEIKORPSCLGRCLDPSLSVLMNISLGLGS 

VFSAVISQKPSRDICQRGTSLT1QCQVDSQVT 

MMFWYRQQPGQSLTL1ATANQGSEATYESGF 

VIDKPPISRPNLTFSTLTVSNMSPEDSSIYLCSA 

GRQGTYEQYFGPGTRLTVTEDLKNVFPPEVA 

VFEPSEAEISHTQKATLVCLATOFYPDHVELS 

WWVNGKEVHSGVSTDPQPLKEQPALNDSRY 

CISSRLRVSATFWQNPRNHFRCQVQFYGLSE 

NDEWTQDRAKP VTQrV S AE A WGRADCGFTS 

ESYQQGVLSATILYEILLGKATLYAVLVSALV 

LMAMVKRKDSRG 


943 


2293 


A 


8070 


1 


879 


MVKVVPATRGNLPRSQLTGTHQHCQPREPKI 

TASERLRRRPRATARLRAHAAPPEPPLAVFAP 

PSDRKELLALPVACDPVIASVMSWVQAASLI 

QGPGDKGDVFDEEADESLLAQREWQSNMQR 

RVKEG YRDGID AG KA VTLQQG FNQG YKKG A 

EVILNYGRLRGTLSALLSWCHLHNNNSTLINK 

INNLLDAVGQCEEYVLKHLKSITPPSHVVDLL 

DSIEDMDLCHV VPAEKK IDEAKDERLCENNA 

EFNKNCSKSHSG IDCSYVECCRTQEHAHS GK 

PKPHMDFGTDSQF 


944 


2294 


A 


8073 


1 


797 


ESARWSRQLRRTLIRLSFPISCGRSHAFGGCK 

MAATSGTDEPVSGELVSVAHALSLPAESYGN 

DPDIEMAWAMRAMQHAEVYYXLISSVDPQF 

LKLTKVDDQIYSEFRKNFETLRTDVLDPEELK 

SESAKEKWRPFCLK.FNG1VEDFNYGTLLRLD 

CSQGYTEENTIFAPR1QFFAIEIARNREGYNKA 

VYISVQDKEGEKGVNNGGEKRADSGEEENT 

KNGGEKGADSGEEKEEGINREDKTDKGGEK 

GKEADKEINKSGEKAM 


945 


2295 


A 


8074 


2 


505 


GAATLLRSASSAARKAAEAEQVWLHLHRYL 
SADRRVLGLREWGRPASERECSLCQRLKREL 
NMGD VEKGKKJF IMKC SQCHTVEKGGKI IKT 
GPNLHGL FGRKTGQAPG YS YTAANKNKGIIW 
OEDTLMEYLENPKKYTPGTKMIFVGIKKKEER 
ADLIAYLKKATNE 


946 


2296 


A 


8081 


42 


590 


EGRRGKFGGKLCNFLFYFHSNSAESRMDVLF 

VAIFAVPLILGQEYEDEERLGEDEYYQWYY 

YTVTPSYDDFSADFTIDYSIFESEDRLNRLDK 

D1TEAIETTISLETARADHPKPVTVKPVTTEPQ 

SPRSEAMPCPVLRSPIPLPPVRVPLFRWGCISC 

KKVGRRLLMTLWMGVWQEEIGR 


947 


2297 


A 


8084 


322 


549 


GGGSSPRELAGAAGLTVTSQAVAARRQQPSF 
SRARAPAHSLRAALSLASSARSWGAVSRDRG 
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PCPPAIMYQSSNKC 


948 


2298 


B 


8093 


3905 


846 


MEPGEVKDRXLENISLSVKKLQSYFAACEDiil 

PAIRNHDKVLQRLCEHLDHALLYGLQDLSSG 

YWVLWHFTRREAIKQIEVLQHVATNLGRSR 

AWLYLALNENSLESYLRLFQENLGLLHKYYV 

KNALVCSHDHLTLFLTLVSGLEFIRFELDLDA 

PYLDLAPYMPDYYKPQYLLDFEDRLPSSVHG 

SDSLSLNSFNSVTSTNLEWDDSAIAPSSEDYD 

FGDVFPAVPSVPSTDWEDGDLTDTVSGPRST 

ASDLTSSKASTRSPTQRQNPFNEEPAETVSSS 

DTTPVHTTSQEKEEAQALDPPDACTELEVIRV 

TKKKKIGKKKKSRSDEEASPLHPACSQKKCA 

KQGDGDSRNGSPSLGRDSPDTMLASPQEEGE 

GPSSTTESSERSEPGLXIPEMKDTSMERLGQPL 

SKVIDQLNGQLDPSTWCSRAEPPDQSFRTGSP 

GDAPERPPLCDFSEGLSAPMDFYRFTVESPST 

VTSGGGHHDPAGLGQPLHVPSSPEAAGQEEE 

GGGGEGQTPRPLEDTTREAQELEAQLSLVRE 

GP VSEPEPGTQEVLCQLKRDQPSPCL S S AEDS 

GVDEGQGSPSEMVHSSEFRVDNNHLLLLMIH 

VFRENE E QLF KMI RMSTGHME G NLQL L YV LL 

TDCYVYLLRKGATEKPYLVEEAVSYNELDY 

VSVGLDQQTVKLVCTNRRKQFLLDTADVAL 

AEFFLASLKSAMIKGCREPPYPSILTDATMEK 

LALAKFV AQESKCEA S A VTVRF YGLVH WED 

PTDESLGFI'PCHCSPPEGTITKEGMLHYKAGT 

S YLGKEHWKTCF\ r VLSNGI LYQYPDRTDVI P 

LLS VNMGGEQCG GCRRANTTDRPHAFQ VIL S 

DPPCLELSAESEAEMAEWMQHLCQAVSKGVI 

PQGVAPSPCIPCCLVLTDDRLFTCHEDCQTSF 

FRSLGTAKLGDISAVSTEPGKEYCVLEFSQDS 

QQLLPPWVrYLSCTSELDRLLSALNSGWKTIY 

QVDLPHTAIQEASNKKKFEDALSLIHSAWQR 

SDSLCRGRASRDPWC* 


949 


2299 


A 


8095 


9 


2374 


ARRADTVLLESPSMLQGLLPVSLLLSVAVSAI 

KELPGVKKYEWYPIRLHPLHKREAKEPEQQ 

EQFETEEKYKMTINGKIAVLYLKKNKNLLAP 

GYTETYYNSTGKEITTSPQIMDDCYYQGHILN 

EKVSDASISTCRGLRGYFSQGDQRYFIEPLSPI 

HRDGQEHALFKYNPDEKNYDSTCGMDGVL 

WAHDLQQNIALPATKLVKLKDRKVQEHEKY 

IEYYLVLDNGEFKRYNENQDEIRKRVFEMAN 

Y VNML YKKLNTHV AL VGMEIWTDKDKIKIT 

PNASFTLENFSKWRGSVLSRRKRHD1AQLITA 

TELAGTTVGLAFMSTMCSPYSVGWQDHSD 

NLLR VAGTMAJ IEMGHNFGMFHDDYSCKCPS 

TICVMDKALSFYIPTDFSSCSRLSYDKFFEDKL 

SNCLFNAPLPTDIISTPICGNQLVEMGEDCDC 

GTSEECTNTCCDAKTaaKATF'QCALGECCEK 

CQFKKAGMVCRPAKDECDLPEMCNGKSGNC 

PDDRFQVNGFPCHHGKGHCLMGTCPTLQEQ 

CTELWGPGTEVADKSCYNRNEGGSKYGYCR 

RVDDTLIPCKANDTMCGKLFCQGGSDNLPW 

KGRIVTFLTCKTFDPEDTSQEIGMVANGTKCG 

DNKVCIN AECVDIEKA YKSTNCSSKCKGHA V 

CDHELQCQCEEG WTPPDCDDSS WFHFS rWG 

VLFPMAV1FVWAMVIRHQSSREK.QKKDQRP 

LSTTGTRPHKQKRKPQMVKAVQPQEMSQMK 

PHVYDU'VEGKEPPASFHKDTNALPPTVFKD 

NPMSTPKDSNPKA 
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950 


2300 


A 


8100 


1 


1251 


MGLLLMIL AS A VLG SFLTLLAQFFLL YRRQPE 

PPADEAARAGEGFRYIKPVPGLLLREYLYGG 

GRDEEPSGAAPEGGATPTAAPETPAPPTRETC 

YFLNATILFLFRELRDTALTRRWVTKKIKVEF 

EELLQTKTAGRLLEGLSLRDVFLGETVPFIKTI 

RLVRPWPSATGEPDGPEGEALPAACPEELAF 

EAEVEYNGGFHLAIDVDLVFGKSAYLFVXLS 

RWGRLRLVFTRVPFTHWFFSFVEDPLIDFEV 

RSQFEGRPMPQLTSir\^QLKKIIKRKHTLPNY 

KTRFKPFFPYQTLQGFEEDEEHTHIQQWALTE 

GRLKVTLLECSRLLrFGSYDREANVHCTLELS 

SSVWEEKQRSSIKTGnSLTAVFMGWHRVSE 

AFPGLWYKLLVDLPFWGLEDGGPLLTVPLRQ 

CPG 


951 


2301 


A 


8108 


1612 


839 


EVALFCFEMAAGMYLEH YLDS IENLPFELQR 

NFQLMRDLDQRTEDLKAEIDKLATEYMSSAR 

SLSSEEKLALLKQIQEAYGKCKEFGDDKVQL 

AMQTYEMVDKHIRRLDTDLARFEADLKEKQI 

ESSDYDSSSSKGKKKGRTQKEKKAARARSKG 

KKSDEEAPKTAQKKLKLVRTSPEYGMPSVTF 

GSVHPSDVLDMPVDPNEPTYCLCHQVSYGE 

MIGCDNPDCSIEWFHFACVGLTTKPRGKWFC 

PRCSQERKKK 


952 


2302 


A 


8112 


595 


291 


PSV A SI . ARRF SGR AL WPPSHS VPGNRAE CPRL 
LHGTTLPGGNQRELARQKNMKKQSDSVKGK 
RRDDGLS AAARKQRD STPRDSEIMQQKQKK 
ANEKKEEPK 


953 


2303 


A 


8118 


1 


669 


VCAGIRDPCSTPLAKPAAGGAENLSFGKQPG 

LETNILKMTTPNKTPPGADPKQLERTGTVREI 

GSQAVWSLSSCKPGFGVDQLRDDNXETYWQ 

SDGSQPHLVNIQFRRKTTVKTLCIYADYKSDE 

SYTPSKISVRVGNNFHNLQEIRQLELVEPSGW 

IHVPLTDNHKKPTRTFM1QIAVLANHQNGRD 

THMRQIKIYTPVEESSIGKFPRCTTIDFMMYRS 

IR 


954 


2304 


A 


8133 


66 


1015 


PPLPPRSFPNLFSRPEPLPEPGRRGCNRSREPA 

ARAPSPPPPFEGAPGRAMVKVTFNSALAQKE 

AKKDEPKSGEEALITPPDAVAVDCKDPDDW 

PVGQRRAWCWCMCFGLAFMLAGVTLGGAY 

LYKYFALQPDDVYYCGIKYTKDDVILNEPSAD 

APAALYQTIEENIKIFEEEEVEFISVPVPEFADS 

DPANTVHDFNKKLTAYLDLNLDKCYVIPLNT 

SIVMPPRNLLELLINIKAGTYLPQSYLIHEHMV

ITDRIENIDHLGFFIYRJLCf IDKETYKLQRRETI 

KGIQKREASNCFAIRHFEKKFAVETLICS


955 


2305 


A 


8143 


35 


1171 


VESRSAWHEGEDQIDRLDFIRNQMNLLTLDV 

KKKJKEVTEEVANKVSCAMTDEICRLSVLVD 

EFCSEFHPNPDVLKJYKSELNKHIEDGMGRNL 

ADRCTDEVNALVLQTQQEIIENLKPLLPAG1Q 

DKLHTLIPCKKFDLSYNLNYHKLCSDF0ED1V 

FRFSLGWSSLVHRFLGPRNAQRVLLGLSEPIF 

QLPRSLASTPTAPTTPATPDNASQEELMITLVT

GLASVTSRTSMGIUVGGVTWKTIGWKLLSVS. 

LTMYGALYLYERLSWTTHAKERAFKQQFVN 

YATEKLRMIVSSTSANCSHQVKQQIATTFARL 

COQVDITQKQLEEEIARLPKJilDQLEKIQNNS 

KLLRNKAVQLENELENFTKQFLPSSNEES 


956 


2306 


A 


8157 


1854 


798 


ASGSPAPSSSSAMAAACGPGAAGYCLLLGLH 
LFLLTAGPALGWNDPDRMLLRDVKALTLHY 
DRYTTSRRLDPIPQLKCVGGTAGCDSYTPKVI

QCQNKGWDGYDVQWHCKTDLD1AYKFGKT 

WSCEGYESSEDQYVLRGSCGLEYNLDYTEL 

GLQKLKESGKQHGFASFSDYYYKWSSADSC 

NMSQLITrvVLLGIAFWYKLFLSDGQYSPPP 

YSEYPFTSHRYQRFTNSAGFPPPGFKSEFTGPQ 

NTGHGATSGFGSAFTGQQGYENSGPGFWTGL 

GTGGILGYLFGSNRAATPFSDSWYYPSYPPSY 

PGTWNRAYSPLHGGSGSYSVCSNSDTKTRTA 

SGYGGTRRK 


957 


2307 


A 


8159 


1492 


528 


THVVTvfTGMCYAPHQVLSYINGVTTSKPGVSL 

VYSMPSRNLSLRLEGLQEKDSGPYSCSVNVQ 

DKQGKSRGHSIKTLELNVLVPPAPPSCRJLQGV 

PHVGANVTLSCQSPRSKPAVQYQWDRQLPSF 

QTFFAPALDVIRGSLSLTNLSSSMAGVYVCKA 

HNEVGTAQCNVTLEVSTGPGAAWAGAWG

TLVGLGLLAGLVLLYHRRGKALEEPANDIKE 

DA1APRTLPWPKSSDTISKNGTLSSVTSARAL 

RPPHGPPRPGALTPTPSLSSQALPSPRLPTTDG 

AHPQPISPIPGGVSSSGLSRMGAVPVMVPAQS 

QAGSLV 


958 


2308 


A 


8161 


2340 


1192 


ELARRPK QQ S SEKSRNMLRN WLTJLFILFPLKL V 

EKCESSVSLTVPPVVKLENGSSnvTVSLTLRPP 

LNATLVTTFEITFRSKNITILELPDEVVVPPGVT 

NSSFQVTSQNVGQLTVYLHGNHSNQTGPRIR

FLVIRSSA1SITNQVIGWIYFVAWSISFYPQVTM 

NWRRKSVIGLSFDFVALNLTGFVAYSVFNIGL

LWVPYIKEQFLLKYPNGVNPVNSNDVFFSLH 

AVVLTLIIIVQCCLYERGGQRVSWPAIGFLVL 

AWLFAFVTMIVAAVGVITWLQFLFCFSYIKL 

AVTLVKYFPQAYMNFYYKSTEGWSIGNVLL 

DFTGGSFSLLQMFLQSYNNDQWTLIFGDPTK 

FGLGVFSIVFDWFFIQHFCLYRKRPGYDQLN 


959 


2309 


A 


8163 


521 


1345 


GERAGRRRGRLGVWAQPQPLLPRPVGSRRE 

MQPPGPPPAYAPTNGDFTFVSSADAEDLSGS1 

ASPDVKLNLGGDF1KESTATTFLRQRGYGWL 

LEVEDDDPEDNKPLLEELDIDLKDIYYKIRCV 

LMPMPSLGFNRQVVRDNPDFWGPLAVVLFFS 

MISLYGQFRWSWIITIWIFGSLTEFLLARVLG 

GEVAYGQVLGVIGYSLLPLTVIAPVLLVVOSF 

EWSTLDCLFGVFWAAYSAASLLVGEEFKTK 

KPLLIYPIFLLYIYFLSLYTGV 


960 


2310 


A 


8167 


1 


2921 


MTCFKGQKGEQRSHAPEANKDHKAKVPSPN 

LYSQLNALQFTVDERSILWLNQFLLDLKQSL 

NQFMAVYKLNDNSKSDEHVDVRVDGLMLK 

FVTPSEVKSECHQDQPRAISIQSSEMIATNTRH 

CPNCRHSDLEALFQDFKDCDFFSKTYTSFPKS 

CDNFNLLHP1FQRHAHEQDTKMHEIYKGNITP 

QLNKNTLKTSAATDVWAVYFSQFWIDYEGM 

KSGKGRPISFVDSFPLSIWICQPTRYAESQKEP 

QTCNQVSLNTSQSESSDLAGRLJCRKKIXKEY 

YSTESEPLTNGGQKPSSSDTFFRFSPSSSEADI 

HLLVHVHKlIVSMQrNHYQYLLLLFLHESLILL 

SENLRKDVEAVTGSPASQTSICIGDLLRSAELA 

LLLHPVDQANTLKSPVSESVSPVVPDYLPTEN 

GDFLSSKRKQISRDINR1RSVTVNHMSDNRSM 

SVDLSHIPLKDPLLFKSASDTNLQKGISFMDY 

LSDKHLGKISEDESSGLVYKSGSGEIGSETSD

KKDSFYTDSSSVLNYREDSNILSFDSDGNQNI

LSSTLTSKGNETIESIFKAEDLLPEAASLSENL 
DISKEETPPVRTLKSQSSLSGKPKERCPPNLAP

LCVSYKNMKRSSSQMSLDTISLDSMILEHQLL 

ESDGSDSHMFLEKGNKKNSTTNYRGTAESVN 

AGANLQNYGETSPDA1STNSEGAQENHDDLM 

SWVFKITGVNGE1DIRGEDTEICLQVNQVTP 

DQLGNISLRHYLCNRPVGSDQKAVIHSKSSPE 

ISLRFBSGPGAV1HSLLAEKNGFLQCHIENFST 

EFLTSSLMN1QHFLEDETVATVMPMKIQVSNT 

KINLKDDSPRSSTVSLEPAPVTVHIDHLWER 

SDDGSFHIRDSHMLNTGNDLKENVKSDSVLL 

TSGKYDLKKQRSVTQATQTSPGVPWPSQSAN 

FPEFSFDFTREQLMEENESLKQELAKAKMAI

AEAHLEKDALLHHIKKMTVE 


961 


2311 


A 


8172 


1442 


682 


TAAMSIFTPTNQIRLTNVAVVRMKRAGKRFEI 

ACYKNKWGWRSGVEKDLDEVLQTHSVFVN 

VSKGQVAKKEDLISAFGTDDQTEICKQILTKG 

EVQVSDKERHTQLEQMFRDIATIVADKCVNP 

ETKRPYTVILIERAMKDIHYSVKTNKSTKQQA

LEVIKQLKEKMKJERAHMRLRFILPVNEGKKL 

KEKLKPLIKV1ESEDYGQQLEIVCLIDPGCFREI 

DELDCKETKGKGSLEVLNLKDVEEGDEKFE 


962 


2312 


A 


8175 


286 


587 


NISNKAEVSSHPSVISHSMDSFGQPRPEDNQS 
VLRRMQKKYWKTKQVFIKATGKKEDEHLVA 
SDAELDAKLEVFHSVQETCTELLKJIEKYQLR 
LNGMKS 


963 


2313 


A 


8181 


13 


2215 


AEGCAERAGTEPWELSMSWESGAGPGLGSQ 

GMDLVWSAWYGKCVKGKGSLPLSAHGIW 

AWLSRAEWDQVTVYLFCDDHKLQRYALNRJ 

TVWRSRSGNELPLAVASTADLIRCKLLDVTG 

GLGTDELRLLYGMALVRFVNLISERKTKJFAK 

VPL KCLA QE VN J PD W] VDLRHEL THKKMPHI 

NDCRRGCYFVLDWLQKTYWCRQLENSLRET 

WELEEFREGIEEEDQEEDKNIWDDITEQKPE 

PQDDGKSTESDVKADGDSKGSEEVDSHCKK 

ALSHKELYERARELLVSYEEEQFTVLEKFRYL 

PKADCAWNNPSPRVECVLAELKGVTCENREA 

VLDAFLDDGFLVPTFEQLAALQIEYEENVDL 

NDVLVPKPFSQFWQPLLRGLHSQNFTQALLE 

RMLSELPALGISGIRPTY1LRWTVELIVANTKT 

GRNARRFSAGQWEARRGWRLFNCSASLDWP 

RMVESCLGSPCWASPQLLRUFKAMGQGLPD 

EEQEKLLRICSIYTQSGENSLVQEGSEASPIGK 

SPYTLDSLYWSVKPASSSFGSEAKAQQQEEQ 

GSVNDVKEEEKEEKEVLPDQVEEEEENDDQE 

EEEEDEDDEDDEEEDRMEVGPFSTGQESFTA 

ENARLLAQKRGALQGSAWQVSSEDVRWDTF 

PI ,GRMPGQTEDP AELMI .ENYDTMYLLDQPV 

LEQRLEPSTCKTDTLGLSCGVGSGNCSNSSSS 

NFEGLLWSQGQLHGLKTGLQLF


964 


2314 


A 


8184 


6 


1393 


EPRRNFRDDSTRPRTRGRTRGRRRRACRSAE 

GTGLRSLLLPPRLQLPAGPFSRCRWDPVSSPR 

PSTMPPKKGGDGIKPPPIIGRFGTSIJeiGrVGLP 

NVGKSTFFNVLTNSQASAENFPFCTIDPNESR 

VPVPDERFDFLCQYHKPASKIPAFLNVVDIAG 

LVKGAHNGQGLGNAFLSHISACDGIFHLTRA 

FEDDDITHVEGSVDPIRDIEirHEELQLKDEEMI 

GPIIDKLEKVAVRGGDKKLKPEYDIMCKVKS 

WVIDQKKPVRFYHDWNDKEIEVLNKHLFLTS 

KPMVYLVNLSEKDYIJUCKNKWLIKIKEWVD 

KYDPGALVIPFSGALELKLQELSAEERQKYLE 
ANMTQSALPKIIKAGFAALQLEYFFTAGPDEV
RAWTIRKGTKAPQAAGK1HTDFEKGFIMAEV 
MKYEDFKEEGSENAVKAAGKYRQQGRNYIV
EDGDIIFFKFNTPQQPKKK 


965 


2315 


A 


8195 


1437 


594 


RSFSLSFSLLSPSEMMALGAAGATRVFVAMV 
AAALGGHPLLGVSATLNSVLNSNAIKNLPPPL 
GGAAGHPGSAVSAAPGILYPGGNKYQTIDNY
QPYPCAEDEECGTDEYCASPTRGGDAGVQIC 
LACRKRRKRCMRHAMCCPGNYCKNGICVSS 
DQNHFRGEIEETITESFGNDHSTLDGYSRRTT
LSSKMYHTKGQEGSVCLRSSDCASGLCCARH 
FWSKICKP\ r LKEGQVCTKHRRK 1 GSHGLEIFQ 
RCYCGEGLSCRIQKDHHQASNSSRLHTCQRH


966 


2316 


A 


8207 


416 


4082 


KFKLIKJMLLTLIILLPVVSKFSFVSLSAPQHW 

SCPEGTLAGNGNSTCVGPAPFLIFSHGNSIFRI 

DTEGTNYEQLVVDAGVSVIMDFHYNEKRIY 

WVDLERQLLQRVFLNGSRQERVCNIEKNVSG 

MAINWINEEVIWSNQQEGIITVTDMKGNNSHI

LLSALKYPANVAVDPVERFIFWSSEVAGSLY 

RADLDGVGVKALLETSEKITAVSLDVLDKRL 

FWIQYNREGSNSLICSCDYDGGSVH1SKHPTQ 

HNLFAMSLFGDRIFYSTWKMKTIWIANKHTG 

KDMVRINT /HSSFVPT .GELKVVHPLAQPKAED 

DTWEPEQKLCKLRKGKCSSTVCGQDLQSHLC 

MCAEGYALSRDRKYCEGNDWKYCEDVNEC 

AFWNHGCTLGCKNTPGSYYCTCPVGFVLLPD 

GKRCHQLVSCPRNVSECSHDCVLTSEGPLCF 

CPEGSVLERDGKTCSGCSSPDNGGCSQLCVPL 

SPVSWECDCFPGYDLQLDEKSCAASGPQPFL 

LFANSQDIRHMHFDGTDYGTLLSQQMGMVY 

ALDHDP VENKJYF AHTALECWI ERANMDGSQ 

RERLIEEGVDVPEGLAVDW1GRRFYWTDRGK 

SLIGRSDLNGKRSK1ITIENISQPRGIAVHPMAK 

RLFWTDTGINPRIESSSLQGLGRLVIASSDLIW 

PSGITIDFLTDKLYWCDAKQSVIEMANLDGSK 

RRRLTQNDVGHPFAVAVFEDYVWFSDWAMP

SVIRVNKRTGKDRVRLQGSMLKPSSLVWHP 

LAKPGADPCLYQNGGCEHICKKRLGTAWCS 

CREGFMKASDGKTCLALDGHQLLAGGEVDL

KNQVTPLDILSKTRVSEDN1TESQHMLVAEIM 

VSDQDDCAPVGCSMYARCISEGEDATCQCLK 

GFAGDGKLCSDIDECEMGVPVCPPASSK.CINT 

EGGYVCRCSEGYQGDGIHCLDIDECQLGVHS 

CGENASCTNTEGGYTCMCAGRLSEPGLICPD 

STPPPHLREDDHHYSVRNSDSECPLSHDGYCL 

HDGVCMYIEALDKYACNCWGYIGERCQYR

DLKWWELRHAGHGQQQKVIVVAVCVW1.V 

MLLLLSLWGAHYYRTQKXLSKNPKNPYEESS 

RDVRSRRPADTEDGMSSCPQPWFWIKEHQD 

LKNGGQPVAGEDGQAADGSMQPTSWRQEPQ 

LCGMGTEQGCWIPVSSDKGSCPQVMERSFH

MPSYGTQTLEGGVEKPHSLLSANPLWQQRAL

DPPHQMELTQ 


967 


2317 


A 


8210 


3 


601 


SSAMGSRSSHAAV1PDGDSIRRETGFSQASLL 

RLHHRFKAJLDRNKKGYLSRMDLQQIGALAV 

NPLGDRIIESFFPDGSQRVDFPGFVRVLAHFRP 

VEDEDTETQDPKKPEPLNSRRNKLHYAFQLY 

DLDRDGKISRHEMLQVLRLMVGVQVTEEQL

ENIADRTVQEADEDGDGAVSFVEFTKSLEKM 

DVEHKMSDRiLK 
968


2318 


A 


8211 


2 


409 


ISSCPHTAYEGSMSTLSNFTQTLEDVFRRIFIT

YMDNWRQNTTAEQEALQAKVDAENFYYVIL 

YLNfVMIGNffSFITVAILVSTVKSFCRREHSNDP 

YHQYIVEDWQEKYKSQtLKLEESKATTHENIG 

AAGFKMSP 


969 


2319 


A 


8215 


1 


1938 


GMPRSRGGRAAPGPPPPPPPPGQAPRWSRWR 

VPGRLLLLLLPALCCLPGAARAAAAAAGAGN

RAAVAVAVARADEAEAPFAGQNWLKSYGY 

LLFYDSRASALHSAKALQSAVSTMQQFYGIP 

VTGVLDQTTIEWMKKPRCGVPDHPHLSRRRR 

NKRYALTGQKWRQKHITYSIHNYTPKVGELD 

TRKAIRQAFDVWQKVTPLTFEEVPYHEIKSDR 

KEADIM1FFASGFHGDSSPFDGEGGFLAHAYF 

PGPG1GGDTHFDSDEPWTLGNANHDGNDLFL 

VAVHELGHALGLEHSSDPSAIMAPFYQYMET 

HKFKLPQDDLQGIQK1YGPPAEPLEPTRPLPTL 

PVRRIHSPSERKHERQPRPPRPPLGDRPSTPGT 

KPNICLXjNFNTVALFRGEMFVFKDRWFWRL 

RNNRVQEGYPMQIEQFWKGLPARIDAAYER 

ADGRFVFFKGDKYWVFKEVTVEPGYPHSLG 

ELGSCLPREGIDTALR\VEPVGKTYFFKGERY 

WRYSEERRATDPGYPEO?ITVWKGIPQAPQGA 

FISKEGYYTYFYKGRDYWKFDNQKLSVEPGY 

PRNILRDWMGCNQKEVERRKERRLPQDDVDI 

M\TINDVPGSVNAVAWIPCILSLCrLVLVYn 

FQFKNKTGPQPVTYYKRPVQEWV 


970 


2320 


A 


8216 


1235 


2223 


SRLSLQFYVSFRRTGLFTCKLIVEIFFRNYMN

DSLRTNVFVRFQPETIACACIYLAARALQIPLP 

TRPHWFLLFGTTEEEIQEICIETLRLYTRKKPN 

YELLEKEVEKRKVALQEAKLKAKGLNPDGTP 

ALSTLGGFSPASKPSSPREVKAEEKSPISINVK 

TVKKEPEDRQQ ASK.SPYNG VRKDSKRS RNS R 

SASRSRSRTOSRSRSHTPRRHYNNRRSRSGTY 

SSRSRSRSRSHSESPRRHHNHGSPHLKAKHTR 

DDLKSSNRHGHKRKKSRSRSQSKSRDHSDAA

KXHRHERGHHRDRRERSRSFERSHKSKHHGG 

SRSGHGRHRR


971 


2321 


A 


8217 


3 


3274 


DCRLQAAMPTNFTVVPVEAHADGGGDETAE 

RTEAPGTPEGPEPERPSPGDGNPRENSPFLNN 

VEVEQESFFEGKNMALFEEEMDSNPMVSSLL 

NKLANYTNLSQGVVEHEEDEESRRREAKAPR 

MGTFIGVYLPCLQNILGVILFLRLTWIVGVAG

VLESFLIVAMCCTCTMLTAISMSAIATNGVVP 

AGGSYYMISRSLGPEFGGAVGLCFYLGTTFA 

GAMYILGT1EIFLTYISPGAAIFQAEAAGGEAA 

AVLHNMRVYOTCrLVLMALVVFVGVKYVN' 

FCLALVFLACWLSILAIYAGVIKSAFDPPD1PV, 

CLLGNRTLSRRSFDACVKAYGIHNNSATSAL * 

WGLFCNGSQPSAACDEYFIQNNVTEIQGIPGA 

ASGVFLENLWSTYAHAGAFVEKKGVPSVPV 

AEESRASTLPYVLTDIAASFTLLVGIYFPSVTG

IMAGSNRSGDLKDAQK^IPTGTILAIVTTSFIY 

LSCrVLFGACIEGWLRDKFGEALQGN'LVIGM 

LAWPSPWVIVIGSFFSTCGAGLQTLTGAPRLL 

QAIARDGIVPFLQVFGHGKANGEPTWALLLT

VLICETGILIASLDSVAPILSMFFLMCYLFVNL 

ACAVQTLLRTPNWRPRFKTYHWTLSFLGMSL 

CLALMFICSWYYALSAMLIAGCIYKY1EYRG 

AEKEWGDGTRGLSLNAARYALI.RVEHGPPHT 

KNWRPQVLVMLNLDAEQAMKHPRLLSFTSQ 



WO 01/57188

PCT/US01/03800

SEQ ID NO: of nucleotide sequence

SEQ ID NO: of peptide sequence

Method

SEQ ID NO: in USSN 09/496914

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid,
LKAGKGLTIVGSVLEGTYLDKHMEAQRAEE

N1RSLM STEKTKGFCQLV VSSSLKJXSMSHJLI Q 

SAGJLGGLKHNTVLMAWPASWKQEDNPFSW 

KNFVDTVRDTTAAHQALLVAKNVDSFPQNQ 

ERFGGGHI D VWW1 VHDGGMLMLLPFLLRQH 

KVWRKCRMRIFTVAQVDDNSIQMKKDLQMF 

LYHLRJSAEVEVVEMVENDISAFTYERTLMM 

EQRSQMLKQMQLSKNEQEREAQLIHDIWTAS 

HTAAAARTQAPPTPDKVQMTWTREKLIAEK

YRSRDTSLSGFKDLFSMKPDQSNVRRMHTAV 

KLNGWLNKSQDAQLVLLNMPGPPKNRQGD 

ENYMEFLEVLTEGLNRVLLVRGGGREVITIYS 


972 


2322 


A 


8224 


701 


246 


TSRRVTMKFNPFVTSDRSIO^RKRHFNAPSHV 

RRKJMSSPLSKELRQKYNVRSMPIRKDDEVQ 

WRGHYKGQQIGKWQVYRXKYVIYIERVQ 

REKANGTTVHVGIHPSKVVrrRLKLDKDRKKJ 

LERKAKSRQVGKEKGKYKEELIEKMQE 


973 


2323 


A 


8237 


873 


4610 


GCPHAGGKGRVPTGGLTGGRTWSPSAAPRSC 

PRPGPTPAPGAMDKLPPSMRKRLYSLPQQVG 

AKAWIMDEEEDAEEEGAGGRQDPSRRSrRLR 

PLPSPSPSAAAGGTESRSSALGAADSEGPARG 

AGKSSTNGDCRRFRGSLASLGSRGGOSGGTG 

SGSSHGHLHDSAEERRLIAEGDASPGEDRTPP 

GLAAEPERPGASAQPAASPPPPQQPPQPASAS 

CEQPSVDTA1KVEGGAAAGDQILPEAEVRLG 

QAGFMQRQFGAMLQPGVNKFSLRMFGSQKA 

VEREQERVKSAGFW1IHPYSDFRFYWDLTML 

L LMV GNL1 II PVG ITFFKDENTTPWI VFN W SD 

TFFLIDLVLNFRTGIWEDNTEnLDPQRIKMK 

YLKSWFMVDFISSIPVDYIFLIVETRJDSEVYK 

TARALRIVRFTKILSLLRLLRLSRLIRYrHQWE 

EIFHMTYDLASAWRTVNLIGMMLLLCmVDG 

CLQFLVPMLQDFPDDCWVSINNMVNNSWGK 

QYSYALFKAMSIIMLCIGYGRQAPVGMSDV 

WLTMLSMIVGATCYAMFIGHATALIQSLDSS

RRQYQEKYKQVEQYMSFHKLPPDTRQRIHD 

Y YEHR YQGKMFDEES1LGELSEPLREEITNFN C 

RKLVASMPLFANADPNFVTSMLTKLRFEVFQ 

PGDYIIREGTIGKKMYFIQHGVVSVLTKGNKE 

TKJLADGSYFGEICLLTRGRRTASVRADTYCR 

LYSLSVDNFNEVLEEYPMMRRAFETVALDRL 

DR1GKKN SILLHKVQHDLNSGVFNYQENEIIQ 

QIVQHDREMAHCAHRVQAAASATPTPTPVIW 

TPLIQAPLQAAAATTSVAIALTHHPRLPAAIFR 

PPPGSGLGNLGAGQTPRHLKRLQSLIPSALGS 

ASPASSPSQVDTPSSSSFHIQQLAGFSAPAGLS 

PLLPSSSSSPPPGACGSPSAPTPSAGVAATTIA 

GFGHFHKALGGSLSSSDSPLLTPLQPGARSPQ 

AAQPSPAPPGARGGLGLPEHFLPPPPSSRSPSS 

SPGQLGQPPGELSLGLATGPLSTPETPPRQPEP 

PSL VAGASGGASPVGFTPRGGLSPPGI rSPGPP 

RTFPSAPPRASGSHGSLLLPPASSPPPPQVPQR 

RGTPPLTPGRLTQDLKLISASQPALPQDGAQT 

LRRASPHSSGESMAAFPLFPRAGGGSGGSGSS 

GGLGPPGRPYGAIPGQHVTLPRKTSSGSLPPP 

LSLFGARATSSGGPPLTAGPQREPGARPEPVR 

SKLPSNL 


974 


2324 


A 


8247 


279 


468 


EYKQWERRFLSCQNRNDLGYGKPRKGGGLL 
EVTVXDASRTCSLTYLLGSHWNNLVVRSPVL
G 
975


2325 


A 


8249 


62 


1571 


LVALKNWKPKGTNIPAFQSPVFGEAVSGVYM 

MTKVLGMAPVLGPRPPQHQVGPLMVKVEEK 

EEKGKYLPSLEMFRQRFRQFGYHDTPGPREA 

LSQLRVLCCEWLRPEIHTKEQILELLVLEQFLT 

ILPQELQAWVQEHCPESAEEAVTLLEDLEPJEL 

DEPGHQVSTPFNEQKPVWEKISSSGTAKESPS 

SMQPQPLETSHKYESWGPLYIQESGEEQEFAQ 

DPRKVRDCRLSTQHEESAJDEQKGSEAEGLKG 

DUSVILANKPEASLERQCVNLENEKGTKPPLQ 

EAGSKKGRESVPTKPTPGERRY1CAECGKAFS 

NSSNLTKHRRTHTGEKPYVCTKCGKAFSHSS

NLTLHYRTHLVDRPYDCKCGKAFGQSSDLLK 

HQRMHTEEAPYOCKDCGKAFSGKGSLIRHYR 

IHTGEKPYQCNECGKSFSQHAGLSSHQRLHT

GEKPYKCKLECGKAFNHSSNFNKHHRIHTGEK 

PYWCHHCGKTFCSKSNLSKHQRVHTGEGEA 

P 


976 


2326 


A 


8257 


298 


7086 


GNMACWPQLRLLLWKNLTFRRRQTCQLLLE 

VAWPLFIFLILISVRLSYPPYEQHECHFPNKAM 

PSAGTLPWVQGIICNANNPCFRYPTPGEAPGV 

VGNFNKSIVARLFSDARRLLLYSQKDTSMKD

MRKVLRTLQQIKKSSSNLKLQDFLVDNETFS 

GFLYHKLSLPKSTVDKMLRADVTLHK.VFLQG 

YQLHLTSI-CNGSKSEEMIQLGDQEVSELCGLP 

REKLAAAERVLRSNMDILKPILRTLNSTSPFPS 

ICELAEATKTLLHSLGTLAQELFSMRSWSDMR 

QEVMFLTNVNSSSSSTQIYQAVSRIVCGHPEG 

GGLKDCSLNWYEDNNYKALFGGNGTEEDAE 

TFYDNSTTPYCNDLMKNLESSPLSRirWKALK 

PJLLVGKILYTPDTPATRQVMAEVNKTFQELA 

VFHDLEGMWEELSPKIWTFMENSQEMDLVR 

MLLDSRDNDHFWEQQLDGLDWTAQDIVAFL 

AKHPEDVQSSMGSVYTWREAFNETNQAIRTIS 

RFMECVNLNKLEPIATEVWLJNKSMELLDER 

KFWAGIVFTGITPGSIELPHHVKYKIRMGIDN

VERTNKIKDGYWDPGPRADPFEDMRYVWGG

FAYLQDVVEQAIIRVLTGTEKKTGVYMQQMP 

YPCYVDDIFLRVMSRSMPLFMTLAWIYSVAV 

I1KGIVYEKEARLKETMRIMGLDNSILWFSWFI 

SSLIPLLVSAGLLWILKLGNLLPYSDPSWFV 

FL SVF A WTILQCFUSTLFSRANLAAACGGn 

YFTLYLPYVLCVAWQDYVGFTLK.IFASLLSP 

VAFGFGCEYFALFEEQGIGVQWDNLFESPVE 

EDGFNLTTSVSMMLFDTFLYGVMTWY1EAVF 

PGQYGIPRPWYFPCTKSYWFGEESDEKSHPGS 

NQKJRJSEICMEEEPTHLKLGVSIQNLVKVYRD 

GMKVAVDGLALNFYEGQITSFLGHNGAGKT 

TTMSILTGLFPPTSGTAYILGKDIRSEMSTIRQ 

NLGVCPQHNVLFDMLTVEEHIWFYARLKGLS 

EKHVKAEMEQMALDVGLPSSKLKSKTSQLS 

GGMQRKLSVALAFVGGSKWILDEPTAGVDP 

YSRRGIWELLLKYRQGRTIILSTHHMDEADVL 

GDRLAI1SHGKLCCVGSSLFLKNQLGTGYYLT 

LVKKDVESSLSSCRNSSSTVSYIKKEDSVSQS

SSDAGLGSDHESDTLT1DVSAISNLIRKHVSEA 

RLVEDIGHELTYVLPYEAAKEGAFVELFHEID 

DRLSDLGISSYGISETTLEEIFLKVAEESGVDA 

ETSDGTLPARRNRRAFGDKQSCLRPFTEDDA 

ADPNDSDIDPESRJETDLLSGMDGKGSYQVKG 

WKLTQQQFVALLWKRLL1ARRSRKGFFAQIV 
LPAVFVCIALVFSLIVPPFGKYPSLELQPWMY

NEQYTFVSNDAPEDTGTLELLNALTKDPGFG 

TRCMEGNPIPDTPCQAGEEEWTTAPVPQTIM 

DLFQNGNWTMQNPSPACOCSSDKIKKMLPV 

CPPGAGGLPPPQRKQNTADILQDLTGRNISDY

LVKTYVQUAKSLKNKIWVNEFRYGGFSLGVS 

NTQALPPSQEVNDATKQMKKHLKLAKDSSA 

DRFLNSLGRFMTGLDTRNNVKVWFNNKG W 

HAISSFLN VINN AILRAN1 .QKGENPSHYGITAF 

NHPLNLTKQQLSEVAPMTTSVDVLVSICVIFA

MSFVPASFVVFLIQERVSKAKHLQFISGVKPVI 

YWLSNFVWDMCNYWPATLVIIIFICFQQKSY 

VSSTNLPVLALLLLLYGWS1TPLMYPASFVFK 

rPSTAYVVLTSVNLFIGINGSVATFVLELFTDN 

KLNNINDILKSVFLIFPHFCLGRGLIDMVKNQ 

AMADALERFGENRFVSPLSWDLVGRNLFAM

AVEGWFFLITVLIQYRFFIRPRPVNAKLSPLN

DEDEDVRRERQRJLDGGGQNDILEIKELTKIY

RRKRBCPAVDRJCVGIPPGECFGLLGVNGAGK

SSTFKMLTGDTTVTRGDAFLNRNSILSNIHEV

HQNMGYCPQFDAITELLTGREHVEFFALLRG

VPEKEVGKVGEWAIRKLGLVKYGEKYAGNY

SGGNKRKLSTAMALIGGPPWFLDEPTTGMD

PKARRFLWNCALSWKEGRSWLTSHSMEEC

EALCTRMAIMVNGRFRCLGSVQHLKNRFGD

GYTIWRIAGSNPDLKPVQDFFGLAFPGSVPK

EKHRNMLQYQLPSSLSSLARIFSILSQSKKRLH

IEDYSVSQTTLDQVFVNFAKDQSDDDHLKDL

SLHKNQTWDVAVLTSFLQDEKVKESYV


977 


2327 


A 


8260 


3 


1567 


IPGSTISFSLCFIFPPCVPTMVRKPVVSTISKGG 

YLQGNVNGRLPSLGNKEPPGQEKVQLKRKV 

TLLRGVSIIIGTJIGAGIFISPKGVLQMTGSVGM 

SLTIWTVCGVLSLFGALSYAELGTTIKKSGGH 

YTYTLEVFGPLPAFVRVWVELLIIRPAATAV1S 

LAFGRYILEPFFIQCEIPELAIKLITAVGITWM 

VLNSMSVSWSARIQIFLTFCKLTAlLIirVPGV 

MQLIKGQTQNFKDAFSGRDSS1TRLPLAFYYG 

MYAYAGWFYLNFVTEEVLTslPEKTIPLAICISM 

AIVTIGYVLTNVAYFTTINAEELLLSNAVAVT 

FSERLLGNFSLAVPIFVALSCFGSMNGGVFAV 

SRLFYVASREGHLPErLSMIHVRKHTPLPAVIV 

LHPLTMTMLFSGDLDSLLNFLSFARWLFIGLA 

VAGLrYLRYKCPDMHRPFKVPLFIPALFSFTC 

LFMVALSLYSDPFSTGIGFVITLTGVPAYYLFII

WDKKPRWFRIMSEKJTRTLQIILEWPEEDKL 


978 


2328 


A 


8261 


2 


2165 


RGGSLRCVLGKLLGQLLCFQSERCVRFPEGLL

RHRGCGLLSSRLSAGKPPLRTSFFGSWGVLPP 

LADAASMSGVRAVRISIESACEKQVHEVGLD 

GTETYLPPLSMSQNLARLAQRJDFSQGSGSEE 

EEAAGTEGDAQEWPGAGSSADQDDEEGWK 

FQPSLWPWDSVRNNLRSALTEMCVLYDVLSI 

VRDKKFMTLDPVSQDALPPKQNPQTLQLISK 

KKSLAGAAQILLKGAERLTKSVTENQENKLQ 

RDFNSELLRLRQHWKLRKVGDKILGDLSYRS 

AGSLFPHHGTFEV1KNTDLDLDKKIPEDYCPL 

DVQIPSDLEGSAYIKVS1QKQAPDIGDLGTVN 

LFKRPLPKSKPGSPHWQTKLEAAQNVLLCKEI

FAQLSREAVQIKSQVPHIVVKNQIISQPFPSLQ 

LSISLCHSSNDKKSQKFATEKQCPEDHLYVLE

HNLHLLIREFHKQTLSSIMMPHPASAPFGHKR 



MRLSGPQAFDKNEINSLQSSEGLLEK1IKQAK

HIFLRSRAAATTDSLASRIEDPQIQAHWSNIND 

VYESSVKVLITSQGYEQICKSIQLQLNIGVEQI 

RVVHRDGRVITLSYQEQELQDFLLSQMSQHQ

VHAVQQLAKVMGWQVLSFSNHVGLGPIESIG 

NASAITVASPSGDYAISVRNGPESGSKIMVQF 

PRNQCKDLPKSDVLQDNKWSHLRGPFKEVQ 

WNKMEGRNFVYKMELLMSALSPCLL 


979 


2329 


A 


8289 


2 


1053 


FVWNPRGGRKRRRQAAVTQAATRASGTPSP 

RDGTMTQGKLSVANXAPGTEGQQQVHGEKK 

EAPAVPSAPPSYEEATSGEGMKAGAFPPAPTA 

VPLHPSWAYVDPSSSSSYDNGFPTGDHELFTT 

FSWDDQKVRRVFVRKVYTILLIQLLVTLAVV 

ALFTFCDPVKDYVQANPGWYWASYAVFFAT 

YLTLACCSGPRRHFPWNLILLTVFTLSMAYLT 

GMLSSYYNTTSVLLCLGITALVCLSVTVFSFQ 

TKFDFTSCQGVLFVLLMTLFFSGL1LAILLPFQ 

YVPWLHAVYAALGAGVFTLFLAJLDTQLLMG 

>1RRHSLSPEEY1FGALNIYLD1IYIFTFFLQLFG 

TNRE 


980 


2330 


A 


8305 


59 


857 


ASQLPDYSISPPSLPPRISFHPSPTLARVAMAEP 

SEATQSHSISSSSFGAEPSAPGGGGSPGACPAL

GTKSCSSSCAVHDLTPWRDVKKTGFVFGTTLI 

MLLSLAAFSVISWSYLILALLSVTISFRIYKSV 

IQAVQKSEEGHPFKAYLDVDITLSSEAFHNY

MNAAMVHINRALKLI1RLKLVEDLVDSLKLA 

VFMWLMTYVGAVFNGITLLILAELLIFSVPIV 

YEKYKTQIDHYVGIARDQTKSIVEKIQAKLPG 

IAKKKAE 


981 


2331 


A 


8308 


186 


1337 


TRMSRHEGVSCDACLKGNFRGRRYKCLICYD
YDLCASCYESGATTTRHrrDHPMQCDLTRVD 
FDLYYGGEAFSVEQPQSFTCPYCGKMGYTET 
SLQEHVTSEHAETSTEV1CPICAALPGGDPNH 
VTDDFAAHLTLEHRAPRDLDESSGVRIIVRR 
MFHPGRGLGGPRARRSNMHFTSSSTGGLSSS 
QSSYSPSNREAMDPIAELLSQLSGVRRSAGGQ
LNSSGPSASQLQQLQMQLQLERQHAQAARQ
QLET ARN ATRR I1MTS S V'lTTITQ STATTN1 AN 
TESSQQTLQNSQFLLTRLNDPKMSETERQSM
ESERADRSLFVQELLLSTLVREESSSSDEDDR
GEMADFGAMGCVDIMPLDVALENLNLKESN 
KGNEPPPPPL 


982 


2332 


A 


8315 


1 


1004 


GSTHASADAWAQWFCTEALVMGAPVWYLV 

AAALLVGFILFLTRSRGRAASAGQEPLHNEEL 

AGAGRVAQPGPLEPEEPRAGGRPRRRRDLGS 

RLQAQRRAQRVAWAEADENEEEAVILAQEE 

EGVEKPAETHLSGKIGAKKLRKLEEKQARKA 

QREAEEAEREERKRLESQREAEWICKEEERLR 

LEEEQKEEEERKAREEQAQREHEEYLKLLKEA 

FWEEEGVGETMTEEQSQSFLTEFrNYlKQSK 

VVLLEDLASQVGLRTQDTINR1QDLLAEGTIT 

GVIDDRGKFiYTrPEELAAVANFIRQRGRVSIA 

ELAQASNSLIAWGRESPAQAPA 


983 


2333 


A 


8320 


244 


1420 


RRRWRARGGLVPTLAWAEATGAYVPGRDKP
DLPTWKRNFRSALNRKEGLRLAEDRSKDPHD 
PHKIYEFVNSGVGDFSQPDTSPDTNGGGSTSD 
TQEDILDELLGNMVLAPLPDPGPPSLAVAPEP 
CPQPLRSPSLDNPTPFPNLGPSENPLKRLLVPG 
EF.WEFEVTAFYRGRQVFQQTISCPEGLRLVGS 








EVGDRTLPGWPVTLPDPGMSLTDRGVMSYV 

RHVLSCLGGGLALWRAGQWLWAQRLGHCH 

TYWAVSEELLPNSGHGPDGEVPKDKEGGVF 

DLGPFIVGSLGPPDLITFTEGSGRSPRYALWFC 

VGESWPQDQPWTKRLVMVKVVPTCLRALVE

MARVGGASSLENTVDLHISNSHPLSLTSDQY 

KAYLQDLVEGMDFQGPGES 


984 


2334 


A 


8321 


1 


1243 


ANMAPVEHWADAGAFLRHAALQDIOKNIY 

TIREVVTEIRDKATRRRLAVLPYELRFKEPLPE 

YVRLVTEFSKKTGDYPSLSATDIQVLALTYQL 

PAP PVr.V T-TT KnRPnifVIfVW?TnUPFTPI TJTC 
Ll*r\X2,r VVJ V OllL-IVV^Lli V ^» V oOulV^nrE i rLTLlO 

GFHLPYKPKPPQETEKGHSACEPENLEFSSFM 

FWRNPLPNIDHELQELLIDRGEDVPSEEEEEEE 

NGFEDRKJDDSDDDGGGWTTPSNIKQIQQELE 

QCDVPEDVRVGCLTTDFAMQNVLLQMGLHV

I , A VNGMI IRE ARSYILRCHGCFKTTSDMSRV 

FCSHCGNKTLKKVSVTVSDDGTLHMHFSRNP 

KVLNPRGLRYSLPTPKGGKYATNPHLTEDQRF 

PQLRLSQKARQKTNVFAPDY1AGVSPFVENDI 

SSRSATLQVRDSTLGAGRRRLNPNASRKKFV

KKR 


985 


2335 


A 


8322 


352 


529 


RRNNIRQFuMKVCISGQARWLTPVVPVLWE-r 
EAGRSLELKSLRPAWATWGNPISTKINK 


986 


2336 


A 


8325 


89 


1172 


KMNPTDIADTTLDESIYSNYYLYESIPKPCTKE 

FKYKRLRSMTDVYLLNLAISDLLPVFSLPFWG 
YYAADQWVFGLGLCKM1SWMYLVGFYSGIF 
FVMLMSIDRYLAIVHAVFSLRARTLTYGVITS
LATWSVAVFASLPGFLFSTCYTERNHTYCKT 

SMIIRTLQHCKNEKKNKJWKMIFAVVVLFLG 
FWTPYNIVLFLETLVELEVLQDCTFERYLDYA
IQATETLAFVHCCLNPIIYFFLGEKFRKYTLQL 
FKTCRGLFVLCQYCGLLQIYSADTPSSSYTQS 
TMDHDLHDAL 


987 


2337 


A 


8326 


3 


470 


SLSAMRFLAATFLLLALSTAAQAEPVQFKDC
GSVDGVTKEVNVSPCPTQPCQLSKGQSYSVN 
VTFTSNIQSKSSKAVVHGILMGVPVPFPIPEPD 
GCKSGINCPIQKDKTYSYLNKLPVKSEYPSIK 
LVVEWQLQDDKNQSLFCWEIPVQIVSHL


988 


2338 


A 


8335 


1205 


323 


VIKMALAARLLPQFLHSRSLPCGAVRLRTPA 

VAEVRLPSATLCYFCRCRLGLGAALFPRSAR 

ALAASALPAQGSRWPVLSSPGLPAAFASFPAC

PQRSYSTEEKPQQHQKTKMTVLGFSNPINWV 

RTRIKAFLIWAYFDKEFSITEFSEGAKQAFAH

VSKLLSQCKFDLLEELVAKEVLHALKEKVTS 

LPDNHKNALAANIDErVFTSTGDISrYYDEKG 

RKFVNILMCFWYLTSANIPSETLRGASVFQVK 

LGNQNVETXQLLSASYEFQREFTQGVKPDWT 

IARJEHSKLLE 


989 


2339 


A 


8349 


67 


185 


MSGFIHQLL1QNLFCVYHTRLKTSQGLCLLSL 
KSLHPMS 


990 


2340 


A 


8361 


210 


1115 


ASPFLRPQGHDSGEREPFSQTPGLMQPFSIPVQ

1TLQGSRRRQGRTAFPASGKKRETDYSDGDPL 

DVHKRLPSSTGEDRAVMLGFAMMGFSVLMF

FLLGTTILKPFMLSIQREESTCTAIHTDIMDDW 

LDCAFTCGVHCHGQGKYPCLQVFVNLSHPG 

QKALLHYNEEAVQINPKCFYTPKCHQDRNDL 

LNSALDIKEFFDHKNGTPFSCFYSPASQSEDVI 
LIKKYDQMAIFHCLFWPSLTLLGGALIVGMV
RLTQHLSLLCEKYSTVVRDEVGGKVPYIEQH
QFKLCIMRRSKGRAEKS 


991 


2341 


A 


8369 


9 


921 


SSVVEFSALSVSMACLSPSQLQKFQQDGFLVL

EGFLSAEECVAMQQRIGEIVAEMDVPLHCRT 

EFSTQEEEQLRAQGSTDYFLSSGDKIRFFFEK 

GVFDEKGNFLVPPEKSINKIGHALHAHDPVFK 

S1THSFKVQTLARSLGLQMPVVVQSMY1FKQP 

HFGGEVSPHQDASFLYTEPLGRVLGVWIAVE 

DATLENGCLWnPGSHTSGVSRRMVRAPVGS 

APGTSFLGSEPARDNSLFVPTPVQRGALVUH 

GKVVHKSKQNLSDRSRQAYTFHLMEASGTT 

WSPENWLQPTAELPFPQLYT 


992 


2342 


A 


8370 


906 


4 


MALSGNCSRYYPREQGSAVPNSFPEWELNV 

GGQVYFTRIISTLISIPIISLLWKMFSPKRDTAN

DLAKDSKGRFFIDRDGFLFRYILDYLRDRQW 

LPDHFPEKGRLKREAEYFQLPDLVKLLTPDEI 

KQSPDEFCHSDFEDASQGSDTRJCPPSSLLPAD 

RKWGFITVGYRGSCTLGREGQADAKFRRVPR 

ILVCGRJSLAKEVFGETLNESRDPDRAPERYTS 

RFYLKFKHLMGAPASNFILGFWGLGQNQDK 

HPVNIYLQQRSVIRPDLTSKKAGDLKGKGDA 

QEVSRRRRWLGDPEHL 


993 


2343 


A 


8379 


1 


2794 


MRMQRHKNDTMDFGDSGKRIGGGVLCLLHQ 

SNTSFIKLNNNGFEDrVIVIDPSVPEDEKnEQIE 

DMVTTASTYLFEATEKRFFFKNVStLIPENWK 

ENPQYKRPKHENHKHADVTVAPPTLPGRBEP

YTKQFTECGEKGEYIHFTPDLLLGKKQNEYG 

PPGKLFVHEWAHLRWGVFDEYNEDQPFYRA 

KSKKJEATRCSAGISGRNRVYKCQGGSCLSRA 

CRIDSTTKLYGKDCQFFPDKVQTEKASIMFM 

QSIDSWEFCNEKTHNQEAPSLQNIKCNFRST

WEVISNSEDFKNTIPMVTPPPPPVFSLLKIRQRI 

VCLVLDKSGSMGGKDRLNRMNQAAKHFLEQ 

TVENGSWVGMVHFDSTATIVNKLIQIKSSDER 

NTLMAGLPTYPLGGTSICSGIKYAFQVIGELH 

SQLDGSEVLLLTOGEDNTASSCIDEVKQSGAJ 

VHFIALGRAADEAV1EMSKITGGSHFYVSDEA 

QNNGLIDAFGALTSGNTDLSQKSLQLESXGLT 

LNSNAWMNDTVIIDSTVGKDTFFLrTWNSLPP 

SISLWDPSGTIMENJFTVDATSKMAYLSIPGTA 

KVGTWAYNLQAKANPETLTITVTSRAANSSV 

PPrTVNAKMNKDVNSFPSPMIVYAEILQGY\'P 

VLGANVTAFTESQNGHTEVLELLDNGAGADS 

FKNDGVYSRYFTAYTENGRYSLKVRAHGGA 

NTARLKLRPPLNRAAY1PGWWNGE1EANPP 

RPEIDEDTQTTLEDFSRTASGGAFWSQVPSL 

PLPDQYPPSQrTDLDATVHEDKIILTWTAPGD 

NFDVGKVQRYIIRISASILDLRDSFDDALQVN 

TTDLSPKEANSKESFAFKPENISEENATHIFIA1 

KSIDKSNLTSKVSNIAQVTLFIPQANPDDIDPT 

PTPTPTPTPDKSHNSGVNISTLVLSV1GSWIV 

NFILSTTI 


994 


2344 


A 


8385 


231 


644 


INSSPRTGRDHQELNLHTERDSRSQRAVLKIP 
RQNPGIFYWIFLPSRSHSASHGSRQRQVSCQG 
TQDEILKMRN1 FAELKNSLEALSSRMDQAEE 
RJGTQAGVQWRDHGSLQPQPPEFKQCFHLSL 
PSSWDYRACLS 


995 


2345 


A 


8390 


194 


3421 


AWRKSSVVPPRGTRRGEKSDQDKSGQKNKR 
DFLSMKQSPALAPEERCRRAGSPKPVLRADD

NNMGNGCSQKLATANLLRFLLLVLIPCICALV 

LLLEILLSYVGTLQKVYFKSNGSEPLVTDGEI 

QGSDVTLTNTIYNQSTWSTAHPDQHVPAWT

TDASLPGDQSHRNTSACMNITHSQCQMLPYH 

ATLTPLLSVVRNMEMEKFLKFFTYLHRLSCY

QHIMLFGCTLAFPECIIDGDDSHGLLPCRSFCE

AAKEGCESVLGMVNYSWPDFLRCSQFRNQT 

ESSNVSRICFSPQQENGKQLLCGRGENFLCAS 

GICIPGKLQCNGYNDCDDWSDEAHCNCSENL 

FHCHTGKCLNYSLVCDGYDDCGDLSDEQNC 

DCNPTTEHRCGDGRCIAMEWVCDGDHDCVD 

KSDEVNCSCHSQGLVECRNGQCIPSTFQCDG 

DEDCKDGSDEENCSV1QTSCQEGDQRCLYNP 

CLDSCGGSSLCDPNNSLNNCSQCEPITLELCM 

NLPYNSTSYPNYFGHRTQKEASISWESSLFPA 

LVQTNCYKYLMFFSCTILVPKCDVNTGEHIPP 

CRALCEHSKERCESVLG1VGLQWPEDTDCSQ 

FPEENSDNQTCLMPDEYVEECSPSHFKCRSGQ 

CVLASRRCDGQADCDDDSDEENCGCKERDL 

WECPSNKQCLKHTVICDGFPDCPDYMDEKN

CSFCQDDELECANHACVSRDLWCDGEADCS 

DSSDRWDCVTT . SINVNS S SF1 .MVFTR A ATEHH 

VPADGWOFII SOLACKOMGLGEPSVTKI IOE 

QFJCEPRWLTLHSNWESLNGTTLHELLVNGQS 

CESRSKISLLCTKQDCGRRPAARMNXRILGGR 

TSRPGRWPWQCSLQSEPSGHICGCVLJAKKW 

VLTVAHCFEGRENAAVWKVVLGrNNLDHPS 

VFMQTRFVKTITLHPRYSRAWDYDISEVELSE

DISETGYVRPVCLPNPEQWLEPDTYCYITGW 

GHMGNKMPFKLQEGEVRIISLEHCQSYFDMK 

TITTRMICAGYESGTVDSCMGDSGGPLVCEK

PGGRWTLFGLTSW'GSVCFSKVLGPGVYSNVS 

YFVEWIKRQtYIQTFLLN 


996 


2346 


A 


8392 


199 


3085 


KVILSSEMSKTNKSKSGSRSSRSRSASRSRSRS 

FSKSRSRSRSLSRSRKRRLSSRSRSRSYSPAHN

RERNHPRVYQNRDFRGHNRGYRRPYYFRGR 

NRGFYPWGQYNRGGYGNYRSNWQNYRQAY 

SPRRGRSRSRSPKRRSPSPRSRSHSRNSDKSSS 

DRSRRSSSSRSSSNHSRVESSfCRKSAKEKKSSS 

KDSRPSQAAGDNQGDEVKEQTFSGGTSQDTK 

ASESSKPWPDATYGTGSASRASAVSELSPRER 

SPALKSPLQSWVRRRSPRPSPVPKPSPPLSST 

SQMGSTLPSGAGYQSGTHQGQFDHGSGSLSP 

SKKSPVGKSPPSTGSTYGSSQKEESAASGGAA 

YTKRYLEEQKTENGKDKEQKQTNTDKEKIKE 

KGSFSDTGLGDGKMKSDSFAPKTDSEKPFRG 

SQSPKRYKLRDDFEKKMADFHKEEMDDQDK 

DKAKGRKESEFDDEPKFMSKVIGANKNQEEE 

KSGKWEGLVYAPPGKEKQRKTEELEEESFPE 

RSKKEDRGKRSEGGHRGFVPEKNFRVTAYK 

AVQEKSSSPPPRKTSESRDKLGAKGDFPTGKS 

SFSITREAQVNVRMDSFDEDLARPSGLLAQER 

KLCRDLVHSNKKEQEFRSIFQHIQSAQSQRSP 

SELFAQHTVTIVHHVKEHHFGSSGNfTLHERFT 

KYLKRGTEQEAAKNKKSPEIHRRIDISPSTFRK 

HGLAHDEMKSPREPGYKAEGKYKDDPVDLR 

LDIERRKKHKERDLKRGKSRESVDSRDSSHSR 

ERSAEKTEKTHKGSKKQKJCHRRARJDRSRSSS

SSSQSSHSYKAEEYTEETEEREESTTGFDK.SRL 
GTTCDFVGPSERGGGRARGTFQFRARGRGWG

RGNYSGNKNNNSNNDFQKRNREEEWDPEYT 

PKSKKYYLHDDREGEGSDKWVSRGRGRGAF 

PRGRGRFMFRKSSTSPKWAHDKESGEEGEIE 

DDESGTENREEKDNIQPTTE 


997 


2347 


A 


8398 


202 


552 


CPALGGRQDLQGTRLLWAHDSGVGGQKAKS
KQENLE SLEATGREEEGGQGPP VTT1CG VLLA 
LLMAGLALQPGTALLCYSCKAQVSNEDCLQ 
VENCTQLGEQCWTARIREWGDDSRQA 


998 


2348 


A 


8400 


697 


301 


NPPSACTPGSCDSCSGRGRDLAFDSVWSTNN 
MSDPRRPNKVLRYKPPPSECNPALDDPTPDY 
MNLLGMIFSMCGLMLKLKWCAWVAVYCSFI 
SFANSRSSEDTKQMMSSFMLSISAWMSYLQ
NPQPMTPPW 


999 


2349 


A 


8401 


93 


1126 


ASASHTTSGHLRCPPGSEGVGTMARCFSLVLL 

LTSIWTTRLLVQGSLRAEELSIQVSCRIMG1TL 

VSKKANQQLNFTEAKEACRLLGLSLAGKDQ 

VETALKASFETCSYGWVGDGFVVISR1SPNPK 

CGKNGVGVLIWKVPVSRQFAAYCYNSSDTW 

TKSCIPEUTTKDPIFNTQTATQTTEFIVSDSTYS 

VASPYSTIPAFT1TPPAPASTSIPRRKKLICVTE 

VFMETSTMSTETEPFVENKAAFKNEAAGFGG 

VFTALLVLALLFFGAAAGLGFCYVKRYVKAF

PFTNKNQQKEMIETKVVKEEKANDSNPNEES

KKTDKNPEESKSPSKTTMRCLEAEV 


1000 


2350 


A 


8406 


2 


777 


KERCQFWKPMLSTVGSFLQDLQNEDKGIKT
AATFTADGNMISASTLMDILLMNDFKLVINKl 
AYDVQCPKREKPSNErlTAEMEHMKSLVrlRL 
FTELHLEESQKXREHHLLEKIDHLKJEQLQPLE 
QVKAGIEAHSEAKTSGLLWAGLALLSIQGGA
LAWLTWWVYSWDIMEPVTYFITFANSMVFF
AYFIVTRQDYTYSAVKSRQFLQFFHKKSKQQ 
HFDVQQYNKLKEDLAKAKESLKQARHSLCL 
QMQVEELNEKN 


1001 


2351


A 


8410 


1400 


264 


VGFWERPLRSSRWFRRSLRRWEMLARAARG 

TGALLLRGSLLASGRAPRRASSGLPRNTVVLF

VPQQEAWVVERMGRPIIRILEPGLNILIPVLDR

IRYVQSLKEIVINVPEQSAVTLDNVTLQIDGV 

LYLRIMDPYKASYGVEDPEYAVTQLAQTTM 

RSELGKLSLDKVFRERESLNASIVDAINQAAD 

CWGIRCLRYEIKDIHVPPRVKESMQMQVEAE 

RRKRATVLESEGTRESAINVAEGKKQAQILAS 

EAEKAEQINQAAGEASAVLAKAKAKAEAIRI 

LAAALTQHNGDAAASLTVAEQYVSAFSKLA 

KDSNTILLPSNPGDVTSMVAQAMGVYGALT 

KAPVPGTPDSLSSGSSRDVQGTDASLDEELDR

VKMS 


1002 


2352 


A 


8421 


134 


941 


NRENLLESRMMDPCSVGVQLRTTNECHKTY 

YTRHTGFKTLQELSSNDMLLLQLRTGMTLSG

NNTICFHHVKIYIDRFEDLQKSCCDPFNIHKKL

AKKNLHVIDLDDATFLSAKFGRQLVPGWKLC

PKCTQIINGSVDVDTEDRQKRKPESDGRTAK

ALRSLQRTNPGRQTEFAPETGKREKRRLTKN

ATAGSDRQVIPAKSKVYDSQGLLIFSGMDLC

DCLDEDCLGCFYACPACGSTKCGAECRCDRK

WLYEQIEIEGGEIIHNKHAG


1003 


2353 


A 


8427 


3 


1416 


TEWGLSGSCPGCSPLEPGSRGRGAAAWRTLR

CRRLPEPSPFLTQPNLAQSQPPAPVPVTDPSVT 

MHPAWLSLPDLRCSLLLLVTWVFTPVTTErr 
SLDTENIDE1LNNADVALVNFYADWCRFSQM
LHP1FEEASDVIKFEFPNENOVVFARVDCDOH 
SDIAQRYRISKYFTLKLFRNGMMMKREYRGQ 
RSVKALADYIRQQKSDPIQEIRDLAEITTLDRS 
KJtNIIGYFEQKDSDNYRVFERVANILHDDCAF 
LSAFGDVSKPERYSGDNITYKPPGHSAPDMVY
LGAMTNFDVTYNWIQDKCVPLVREITFENGE 

PT TFFOT PF1 IT FFTMVPTYTF*!! FIFfYNrFVAPnT 

ISEKGTTNFLHADCDKFRHPLLHIQKTPADCP 
V1AIDSFRHMYVFGDFKD\XIPGKLKQFVFDL 
HSGKLHREFHHGPDPTDTAPGEQAQDVASSP 
PESSFQKLAPSEYRYTLLRDRDEL 










0111 




ACAAARSPADQDRFICIYPAYLNNKKTIAEGR

RIPISKAVENPTATEIQDVCSAVGLNVFLEKN 

KMYSREWNRDVQYRGRVRVQLKQEDGSLC 

DQSLQQGEGSKKGKGKKKK 


1005 


2355 


A 


8453 


90 


530 


QSHETKMQSGTHWRVLGLCLLSVGVWGQD 
GNEEMGGITQTPYKVSISGTTVILTCPQYPGSE 
ILWQHNDKNIGGDEDDKNIGSDEDHLSLKEF 
SELEQSGYYVCYPRGSKPEDANFYLYLRARG 


1006 


2356 


A 


8458 


3 


307


AVQRIRHEMNIFRLTGDLSHLAAIVILLLKIW
KTRSCAGISGKSQLLFALVFTTRYLDLFTSFIS
LYNTSMKVWYAIHRNVFHLQCTGLWTLNLC


1007 


2357 


A 


8459 


43 


553 


GAGAGGDWAAMDKLKKVLSGQDTEDRSGL

SEVVEASSLSWSTRIKGFIACFAIGILCSLLGT 

VLLWVPRKGLFILFAVFYTFGNIASIGSTIFLM

GPVKQLKRMFEPTRLIATIMVLLCFALTLCSA 

FWWHNKGLALIFCILQSLALTWYSLSFTPFAR 

DAVKKCFAVCLA 


1008 


2358 


A 


8462 


487 


150 


AQDIRSVHSLGQKSTFVKHFRTLSHLHGLPDP 
PPHWPPQERSPPSHPCMPSHRPQIPQLSNSGPS 

FlPP WOPVfrP^X/TPT^TrT PfiAVFA«;TTK - A ( 3T P 

KCPVDSSLPTPEACFL 


1009 


2359 


A 


8465 


134 


954 


ETRVKTSLELLRTQLEPTGTVGNTIMTSQPVP

NETIIVLPSNVINFSQAEKPEPTNQGQDSLKKH 

LHAEIKVIGTIQILCGMMVLSLGIILASASFSPN 

TKLLVHSSLVGSILSALSALVGFIILSVKQATL 
NPASLQCELDKNNIPTRSYVSYFYHDSLYTTD 
CYTAKASLAGTLSLMLICTLLEFCLAVLTAVL 
RWKQAYSDFPGSVLFLPHSYIGNSGMSSKMT 
HDCGYEELLTS 


1010 


2360 


A 


8468 


2 


473 


KYRYRRPYPVMRKICQVGPAGLAFILNISPVA 
IIRVALCHLAGCQEQAAWYHTLQILFFLVSAY
FFSCPVPEK.YFPGSCDIVGHGHQIFHAFLSICT 
LSQLEAILLDYQGRQEIFLQRHGPLSVHMACL 
SFFFLAACSAATAALLRHKVKARLTKKDS 


1011 


2361 


A 


8478 


5 


409 


TELSQLEKAHPPADMGRRKSK_RJCPPPKKKMT 
GTLETQFTCPFCNHEKSCDVKMDRARNTGVI 
SCTVCLEEFQTPITC1LGNLGFFQRVGRGLESG 
PCSSGPLCALVQGQSRPEEQVPPSDFCGVRRC 
RAGFQCQ 


1012 


2362 


A 


8481 


2810 


1652 


RTSTQKWQSVFNDSQEHLERFYCNPENDRM 
RMKYGGQEFWADLNAM>rVYETTEFDQLRR 
LSTPPSSNVNSrYHTVWKFFOlDHFGWREYPE 
SVIRIJEEANSRGLKEVRFMMWNNHYILHNS

FFRKEIKRRPLFRSCFILLPYLQTLGGVFTQAP 

PPLEATSSSQIICPDGVTSANFYPETWVYMHP 

SQDFIQVPVSAEDKSYRIIYNLFHKTVPEFKYR

ILQrLRVQNQFLWEKYKRKKEYMNRKMFGR 

DRIINERHLFHGTSQDWDGICKHNFDPRVCG 

KHATMFGQGSYFAKKASYSHNFSKKSSKGV

HFMFLAKVLTGRYTMGSHGMRRPPPVNPGS 

VTSDLYDSCVDNFFEPQIFVEFNDDQSYPYFVI 

QYEEVSNTVSI 


1013 


2363 


A 


8488 


2 


517 


IENCRTRLRQAWHEVCGNKMAAPIPQGFSCL 

SRFLGWWFRQPVLVTQSAAIVPVRTKKRFTP 

PIYQPKFKTEKJEFMQHARKAGLVIPPEKSDRS 

IHLACTAGIFDAYVPPEGDARISSLSKEGLIER 

TERMKKTMA S Q V S IRRIKD Y D ANF KIKD F PF 

KAKD1FIEGSPLY 


1014 


2364 


A 


8501 


363 


17 


Y1RTG YVY I CDY AQLMYTYYIRTA YVYICILY 
AQLMYTYVLYTHSLCIHMYSIRTAYVY1CIIY 
AQIMYTYVFYTHRLCIHMYSIRTDYVYICILY 
AQLMYTYVFYTHSYMSDE 


1015 


2365 


A 


8504 


3 


2190 


NSSEHFS Q APQRLSF YSWYGS ARLFRFRVPPD 

AVLLRWLLQVSRESGAACTDAErrVHFRSGA 

PPVINPLGTSFPDDTAVQPSFQVGVPLSTTPRS 

NAS VN VSHPAPGD WFVAAHL PPSSQKIELKG 

LAPTCAYVFQPELLVTRWEISIMEPDVPLPQ 

TLLSHPSYLK.VFVPDYTRELLLELRDCVSNGS 

LGCP VRLTVGP VTLPSNFQKVLTCTG AP WPC 

RLLLPSPPWDRWLQVTAESLVGPLGTVAFSA 

VAALTACRPRSVTIQPLLQSSQNQSFNASSGL 

LS PS PDHQDLGRSGRVDRSPFCLTNYP VTRED 

MD VV S VHFQPLDRV S VRVCSDTPSVMRLRL 

NTGMDSGGSLTIS LRANKTEMRNETVV VAC V 

NAASPFLGFNTSLNCTTAFFQGYPLSLSAWSR 

RANLIIPYPETDNWYLSLQLMCPENAEDCEQ 

AVVHVETTLYLVPCLNDCGPYGOCLLLRRHS 

YLYASCSCKAGWRGWSCTDNSTAQTVAQQR 

AATLLLTLSNLMFLAPIAVSVRRFFLVEASVY 

AYTMFFSTFYHACDQPGEAVLCILSYDTLQY 

CDFLGS G AAIWVTELCMARLKTVLKYVLFLL 

GTLVIAMSLQLDRRGMWNMLGPCLFAFVIM 

ASMWAYRCGHRRQCYPTSWQRWAFYI.LPG 

V SMAS VGIAIYTSMMTSDNYYYTHSIWHILL 

AGSAALLLPPPDQPAEPWACSQKFPCHYQIC 

KNDREELYAVT 


1016 


2366 


A 


8511 


1 


453 


KWYPSGPVRIPGRFYYTCLPAGHRRCRMAPAK 

KGGEKKKGRSAINEVVTREYTINIHKRIHGVG 

FKXRAPRALKEIRKTAMKEMGTPDVRIDTRL 

NKAVWAKGIRNVPYR1RVRLSRKRNEDEDSP 

NKLYTLVTYVPVTTFKNLQTVNVDEN 


1017 


2367 


A 


8513 


54 


1196 


LERTPASADMAWTKYQLFLAGLMLVTGSINT 

LSAKWADNFMAEGCGGSKEHSFQHPFLQAV 

GMFLGEFSCLAAFYLLRCRAAGQSDSSVDPQ 

QPFNPLLFLPPALCDMTGTSLMYVALNMTSA 

SSFQMLRGAVIIFTGLFSVAFLGRRLVLSQWL 

GILATIAGLVWGLADLLSKHDSQHKLSEVIT 

GDLLnMAQirVAIQMVLEEKFVYKHNVHPLR 

AVGTEGLFGFVILSLLLVPNfYYTPAGSFSGNP 

RGTLEDALDAFCQVGQQPLIAVALLGNISSIA 

FFNFAGISVTKELSATTRMVLDSLRTVVIWAL 
SLALGWEAFHALQILGFLILLIGTALYNGLHR
PLLGRLSRGRPLAEESEQERLLGGTRTPINDA 
S 


1018 


2368 


A 


8518 


324 


694 


SPFWTEKRRMEKPLFPLVPLHWFGFGYTALV 

VSOGrVGYYKTGSVPSLAAGLLFGSLAGLGA 

YQLYQDPRNVWGFLAATSVTFVGVMGMRS 

YYYGKFMPVGLIAGASLLMAAKVGVRMLM 

TSD 


1019 


2369 


A 


8526 


2 


1787 


VSAAAVNMEPPDAPAQARGAPRLLLLAVLL 

AAHPDAQAEVRLSVPPLVEVMRGKSVILDCT 

PTGTHDHYMLEWFLTDRSGARPRLASAEMQ 

GSELQVTMHDTRGRSPPYQLDSQGRLVLAEA 

QVGDERDYVCVVRAGAAGTAEAAARLNVF 

AKPEATEVSPNKGTLSVMEDSAQELATSNSRN 

GNPAPKITWYRNGQRLEVPVEMNPEGYMTS 

RTVREASGLLSLTSTLYLRLRKDDRDASFHC 

AAHYSLPEGRHGRLDSPTFHLTLHYPTEHVQ 

F WVGSPSTP AG WVREGDTVQLLCRGDGSPSP 

EYTLFRLQDEQEEVLNVNLEGNLTLEGVTRG 

QSGTYGCRVEDYDAADDVQLSKTLELRVAY 

LDFLELSEGKVLSLPLNSRAWNCSVHGLPTP 

ALRWTKDSTPLGDGPMLSLSS1TFDSNGTYVC 

EASLPTVPVLSRTQNFTLLVQGSPELKTAEIEP 

KADGSWREGDE VTL ICS ARGHPDPKLS WSQL 

GGSPAEPIPGRQGWVSSSLTLK.VTSAJLSRDGI 

SCEA SNPHGNKRHVFHFGTVSPQTSQ AG VA V 

MAVAVSVGLLLLVVAVFYCVRRKGGPCCRQ 

RREKGAP 


1020 


2370 


A 


8530 


2 


1200 


PRVRLLRPSRSRSCRGLLSTRAPGPSPFRSLHS 

SPLLPHAMKSPFYRCQNITSVBK.GNSAVMGG 

VLFSTGLLGNLLALGLLARSGLGWCSRRPLR 

PLPSVPYMLVCGLTVTDLLGKCLLSPWLAA 

YAQNRSLRVLAPALDNSLCQAFAFFMSFFGL 

SSTLQLLAMALECWLSLGHPFFYRRHITLRLG 

ALVAPWSAFSLAFCALPFMGFGKFVQYCPG 

TWCFIQMVHEEG SLS VLGYSVLY S S LM ALL V 

LATVLCNLGAMRNLYAMHRRLQRHPRSCTR 

DCAEPRADGRJEASPQPLEELDHLLLLALMTV 

LFTMCSLPVTYRAYYGAFKDVKEKNRTSEEA 

EDLRALRFLSVISIVDPWIFIIFRSPVFRIFFHKI 

FIRPLRYRSRCSNSTNMESSL 


1021 


2371 


A 


8536 


1 


237 


RRGEIDMATEGDVELELETETSGPERPPEKPR 
KHDSGAADLERVTDYAEEKEIQSSNLETAMS 
VIGDRRSREQKAKQER 


1022 


2372 


A 


8537 


94 


541 


RKERRJRRRRRMEAVVFVFSLLDCCALIFLSV 
YFHTLSDLECDYINARSCCSICLNKWV1PELIG 
HTIVTVLLLMSLHWFIFLLNXPVATWNTYRYI 
MVPSGNMGVPDPTE1HNRGQLKSHMKEAMI 
KLGFHLLC FFMYL YSM I L ALIND 


1023 


2373 


A 


8540 


26 


431 


RMMKCPQALLAIFWLLLSWVSSEDKWQSPL 
SLVVHEGDTVTLNCSYEVTKFRSLLWYKQEK 
KAPTFLFMLTSS GIEKKSGRLS SILDKKELSSIL 
NITATQTGDSAIYLCAVEAQCSLVTCSLYSNS 
TAEALQL 


1024 


2374 


A 


8544 


1731 


743 


GVRLRYSPIAVVMVGEAGRDLRRRRAVAVT 

AEKMAVLAPLIALVYSVPRLSRWLAQFYYLL 

SALLSAAFLLVRKLPPLCHGLPTQREiXiNPCD 

FDWP^VEE.MFLSAJVMMKNRRSITVEQHIGN 

IF^IFSKVAWILrTRLDIRMGLLYrrLCiVFLM 
TCKPPLYMGPEYIKYFNDKTJDEELERDKRVT

WIVEFFANWSNDCQSFAPIYADLSLKYNCTG 

LNFGKVDVGRYTDVSTRYKVSTSPLTKQLPT 

LILFQGGKEAMRRPQIDKKGRAVSWTFSEEN 

VIREFNLNELYQRAKKLSKAGDNIPEEQPVAS 

TPTTVSDGENKKDK 


1025 


2375 


A 


8546 


2194 


1707 


TVSFHKTMASLKC STVVC VICLEKPKYRCP A 

CRVPYCSVVCFRKHKEQCNPETRPVEKKIRS 

ALPTKTVKPVENKDDDDSIADFLNSDEEEDR 

VSLQNLKNLGESATLRSLLLNPHLRQLMVNL 

DQGEDKAKLMRAYMQEPLFVEFADCCLG1V 

EPSQNEES 


1026 


2376 


A 


8547 


1078 


594 


VGMELPAVNLKV1LLGHWLLTTWGCTVFSGS 
YAWANFTILALGVWAVAQRDSIDA1SMFLGG 
LL ATTFLDIVHI S IFYPRVS LTDTG RFG VG MAIL 
SLLLKPLSCCFVYHMYRERGGELLVHTGFLG 
S SQDRS A YQTIDS AEAP ADPFA VPEGRSQD AR 
GY 


1027 


2377 


A 


8557 


1 


340 


DFLGPASPQEEGGSESSTMTELETAMGMITDV 
FSRYSGSEGSTQTLTKGELKVLMEKELPGFLQ 
SGKDKD AVDKLLKDL D ANGD AQ VDFSEFIVF 
VAAITSACHKYFEKAGLK 


1028 


2378 


A 


8569 


20 


963 


KMAATLGPLG S WQQ WRRCLS ARDGSRRL L L 

LLLLGSGQGPQQVGAGQTFEYLFCREHSLSKP 

YQGE APRPCFLRD WELQ VHFKIHG QGKKNL 

HGDGLAIWYTKDRMQPGPVFGNMDKFVGLG 

VFVDTYPNEEKQQERVFPYISAMVNNGSLSY 

DHERDGRPTELGGCTA IVRNLHYDTFI ,VTRY 

VKRHLTIMMDIDGKHEWRDCIEVPGVRLPRG 

YYFGTSSITGDLSDNHDVISLKLFELTVERTPE 

EEKLHRDVFLPSVDNMKLPEMTAPLPPLSGL 

ALFLIVFFSLVFSVFAIVIGIILYNKWQEQSRK 

RFY 


1029 


2379 


A 


8572 


1 


578 


AAAASHRSRARSRPRRVS SGP APRRAQ SSAG 

RVASGLDSAPLCTMARALCRLPRRGLWLLLA 

HHLFMTTACQEANYGALLRELCLTQFQVDM 

EAVGETLWCDWGRTIRSYRELADCTWHMAE 

KLGCFWPHAEVDRFFLAVHGRYFRSCPISGR 

AVRDPPGSILYPFIWPITVTLLVTALVVWQS 

KRTEGIV 


1030 


2380 


A 


8574 


1352 


372 


DSSTVKGGSESRHLCL1PDLKGKARTREASSG 

SRTCGRRTSLCTSAKSSWTYRSGRLSWQSIKG 

THLTTTQALRQPLHRAPLLPGQLCWSPRPLEK 

NKAMGRPLLLPLLLLLQPPAFLQPGGSTGSGP 

S YL YG VTQPKHLS ASMGGS VEIPFSFYYP WEL 

AIVPNVRISWRRGHFHGQSFYSTRPPSIHKDY 

VNRLFLNWTEGQESGFLRISNLRKEDQSVYF 

CRVELDTRRSGRQQLQSIKGTKLTITQAVTTT 

TTWRPSSTTTIAGLRVTESKGHSESWHLSLDT 

AIRVALAVAVLKTVILGLLCLLLLWWRRRKG 

SRAPSSDF 


1031 


2381 


A 


8580 


905 


340 


RRTAGIYPCFPKPGRTRHALCSWLLLLTGQL 

AFDDFQESCAMMWQKYAGSRRSMPLGARIL 

FHGVFYAGGFAIVYYLIQKFHSRALYYKLAV 

EQLQSHPEAQEALGPPLNIHYLKLIDRENFVD1 

VDAKLKJPVSGSKSEGLLYVHSSRGGPFQRW 

HLDEVFLELKDGQQIPVFKLSGENGDEVKKE 


1032 


2382 


A 


8593 


2558 


961 


RRRPRLLPGAEPCEPRVGPRRADMGCSAKAR 
W AAG ALG VAG LLCA VLG A VMI VMVPSLfKQ 






WO 01/57188 



PCT/US01/03800 



SEQ ID NO: of nucleotide sequence


SEQ ID NO: of peptide sequence


Method


SEQ ID NO: in USSN 09/496,914


Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence


Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop codon, /=possible nucleotide deletion, \=possible nucleotide insertion)














QVLK^VRIDPSSLSF>TM^TCEIPIPFYLSVYFFD~ 
VMNPSE1LKGEKPQVRERGPYVYREFRHKSNI 
TFNNNDTVSFLEYRTFQFQPSKSHGSESDYTV 
MPNILVLGAAVMMEMKPMTLKLIMTLAFTTL 
GERAFN/ThJRTVGFrMWrrYVnPI VMT rXTK'VPT* 

GMFPFKDKFGLFAELNNSDSGLFTGFTGVQNJ 
SRIHLVDKWNGLSKVDFWHSnOf^lMTlsro'rv: 

GQMWPPFMTPESSLEFYSPEACRSMKLMYKE 

SGVFEGIPTYRFVAPKTLFANGSrYpPNEGFCP 

CLESGIQNVSTCRFSAPLFLSHPHFLNADPVL 

AEAVTGLHPNQEAHSLFLDIHPVTGrpMNC&V 

KLQLSLYMKSVAGIGQTGKIEPWLPLLWFA 

ESGAMEGETLH7TYTOLVT WPKVWHVAnw 

LLALGC VLLL V P VICQ1RSQEKC YLFW SSSKK 
GSKDKEAIQAYSESLMTSAPKGSVLQEAKL 


1033 


2383 


A 


8595 


595 


767 


AHLPDTLLLPPHSPTVFTPKSFQCSQKACFSRS 
FCLLLSLVSSSLVSLSLCPPLTQA 


1034 


2384 


A 


8597 


640 


164 


VTTSCI1PFAFGLGVRASERLAEIDMPYLLKYQ 

VNMGDRTSMVQDPGSQAPTSWISESQVFQTT 
EVLTTRITELQRRFPT WTPDQYLRGGLC AYS G 
GAGYVRSSQDLSCDFCNDVLARAKYLKRHG 
F 


1035 


2385 


A 


8603 


936 


204 


AMASTLEYSPSPLRRLVGPAAGFSRAARADL 
avviyriYiArrlULWurr lUVoKVLanHLrbl IG 
SLSAIQKMTRVRVVDNSALGNSPYHRAPRCI 
I IVYKKNGVGKVGDQILLAIKGQKKKAL1VG 
HCMPG PRMTPRFDSNNV VLIEDNGNPVGTRJ 
KTPIPTSLRKREGEYSKVLAIAQNFV 


1036 


2386 


A 


8606 


1 


562 


PTRAHSFDLCCSPCRRRLLGREEAGEEF1SPV 

TQYLQPRSPEECKMFACAKLACTPSL1RAGSR 

VAYRPISASVLSRPEASRTGEGSTVFNGAQNG 

VSQLIQREFQTSAISRD1DTAAKFIGAGAATVG 

VAGSGAGIGTVFGSLnGYARNPSLKQQLFSY 

AILGFAI .SEAMGLFCI.MVAFI JLFAM 


1037 


2387 


A 


8615 


2 


2364 


SPGPSLPESAESLDGSQEDKPRGSCAEPTFTDT 

GMVAHINNSRJLKAKGVGQHDNAQNFGNQSF 

EELRAACLRKGELFEDPLFPAEPSSLGFKDLG 

PNSKNVQNISWQRFKDIINNPLFIMDGISPTDI 

CQGILGDCWLLAAIGSLTTCPKLLYRWPRG 

QSFKKNYAGIFHFQIWQFGQWVNVWDDRL 

PTKNDKLVFVHSTERSEFV/SALLF.KAYAKLS 

GSYE ALS GGSTMEGLEDFTGGVAQSFQLQRP 

PQNLLRLLRKAVERSSLMGCSIEVTSDSELES 

MTDKMLVRGHAYSVTGLQDVHYRGKMETLI 

R VRNPWGRIE WNGA WSDS ARE WEE VASDIQ 

MQLLHKTEDGEFWMSYQDFLNNFTLLEICNL 

TPDTL S GDYKS Y WHTTFYEGS WRTG SS AGGC 

RNHPGTF WTNPQFKI SLPEGDDPEDD AEGNV 

WCTCLVALMQKNWRHARQQGAQLQTIOFV 

LYAVPKEFQNIQDVHLKKEFFTKYQDHGFSEI 

FTNSREVSSQLRLPPGEYUIPSTFEPHRDADFL 

LRVFTEKHSESWELDEVNYAEQLQEEKVSED 

DMDQDFLHLFKTVAGEGKEIGVYELQRLLNR 

rvtAIKFKSFKTKGFGLDACRCMINLMDKDGSG 

KLGLLEFKJLWKXLKKWMDIFRECDQDHSGT 

LNSYEMRLV1EKAGIKLNNKVMQVLVARYA 

DDDLFIDFDSFISCFLRLKTMFTFFI.TMDPKNT 

GH1CLSLEQVLGEGWEGICRIAPACPSTPPPPS 




SDVPGPASCPRLFPPWDLLPVSTVAADDHVG1 
HAL 


1038 


2388 


A 


8621 


3 


1494 


RSRMARAPLGVIXLLGLLGRGVGKNEELRJLY 

HHLFNNYDP G SRP VREPEDTVTI SLK VTLTNL 

ISLNEKEETLTPSVWIGIDWQDYRLNYSKDDF 

GGIETLRVPSELVWLPEIVLENN1DGQFGVAY 

DANVLVYEGGSVTWLPPAIYRSVCAVEVTYF 

PFD WQNCSL I F RSQT YN AEE VEFTF A VDNDG 

KTINKIDIDTEAYTENGEWAIDFCPGVIRRHH 

GG ATDGPGETD VI YSL1IRRKPLF YV IKII VPC V 

LISGLVLLAYFLPAQAGGQKCTVSrNVLLAQT 

VFLFLIAQK1PETSLSVPLLGRFLIFVMVVATLI 

VMNCVIVLNVSQRTPTTHAMSPRLRHVLLEL 

LPRLLGSPPPPEAPRAASPPRRASSVGLLLRAE 

ELILKKPRSELVFEGQRHRQGTWTAAFCQSL 

GAAAPEVRCCVDAVNFVAESTRDQEATGEE 

VSDWVRMGNALDNICFWAALVLFSVGSSLIF 

LGAYFNRVPDLPYAPCIQP 


1039 


2389 


A 


8636 


1 


900 


PGRERPGGGGARRRPQHLPALLPSERPDCATL 

QAMENELPVPHTSSSACATSSTSGASSSSGCN 

NSSSGGSGRPTGPQISVYSGIPDRQTVQVIQQ 

ALHRQPSTAAQYLQQMYAAQQQHLMLQTA 

ALQQQHLSSAQLQSLAAVQQASLVSNRQGST 

SGSN VSAQAPA QSSSINLAASPAAA QLLNRA 

QSVNSAAASGIAQQAVLLGNTSSPALTASQA 

QMYLRAQMLIFTPTATVATVQPELGTGSPAR 

PPTPAQVQNLTLRTQQTPAAAASGPTPTQPVX, 

PSLAJLKPTPGGSQPLPTPA 


1040 


2390 


A 


8645 


98 


1388 


ASQLAFGGKLTSTPSRDFQGCGRGAVTCCSF 

HEHRHQ SGRCL STGMAPNLKGRPRKKKPCPQ 

RRDSFSGVKDSNNNSDGKAVAKVKCEARSA 

LTKPKNNHNCKXVSNEEKPKVAJGEECRADE 

QAFLVALYKYMKERKTPIERIPYLGFKQINLW 

TMFQAAQKLGGYETITARRQWKHTYDELGG 

NPGSTSAATCTRRHYERLILPYERFIKGEEDKP 

LPPIKPRKQENSSQENENKTKVSGTKRIKHEIP 

KSKKEKENAPFGPQDAAEVSSEQEKEQETLISQ 

KSIPEPLPAADMKKKIEGYQEFSAKPLASRVD 

PEKDhTETDQGSNSEKVAEEAGEKGPTPPLPSA 

PLAPEKDSALVPGASKQPLTSPSALVDSKQES 

KIXCFTESPESEPQE A SFPRLPHHTGHRWQTR 

MRRRMTNCPPWQITLPTAP 


1041 


2391 


A 


8646 


113 


1492 


LLQEMCTKTIPVLWGCFLLWNLYVSSSQTrYP 

GIKAR1TQRALDYGVQAGMKMIEQMLKEKK 

LPDLSGSESLEFLKVDYVNYNFSN1KJSAFSFP 

NTSLAFVPGVG1KALTNHGTANISTDWGFESP 

LFVLYNSFAEPMEKPrLKNLNEMLCPIIASEVK 

ALNANLSTLEVLTKIDNYTLLDYSLISSPErrE 

NYLDLNLKGVFYPLENLTDPPFSPVPFVLPER 

SN SMLY IGIAEYFFKS ASFAHFTAQVFKVTL S 

TEEISNHFVQNSQGLGNVLSRIAEIYILSQPFM 

VRIMATEPPHNLQPGNFTLDIPASIMMLTQPK 

NSTVET1 V SMDFVA STS VGL VILGQRLVCSLS 

LNRFRLALPESNRSNffiVLRFENILSSILHFGVL 

PLANAKLQQGFPLPNPHKFLFVNSDIEVLEGF 

LLISTDLKYETSSKQQPSFHVWEGLNLISRQW 

RGKSAP 


1042 


2392 


A 


8672 


538 


170 


ARRIARTRESKAA VSQDN VPALQPGKJCKKLR 
LGGKKKKFKFFRLPKEFKKQLMYSPSNFKKM 
TSLAGNTVQCLNKLKYVIYSAQYPAYGNITT
LDMITSTDHVLEQDFWICFTFY SVTCERQI 


1043 


2393 


A 


8688 


359 


17 


GLKTRAPATPTFQREVLGPAKQDMQRRCPR1 
GLMTSLLKPDCRRWRDYKRWKSGGFTGESC 
HHADTLGDRGGLQGDHSELLQWQKRILRTE 
GEPSPKYISKNIFPICSYITGFL 


1044 


2394 


A 


8718 


292 


1490 


GTVKTSVATPITAGHSCSSGGVLQVKSPATQS 

GFKFTSKMEDFNMESDSFEDFWKGEDLSNYS 

YSSTLPPFLLDAAPCEPESLEINKYFWIIYAL 

VFLLSLLGNSLVMLVILYSRVGRSVTDVYLL 

NLAL ADLLFALTLPI WAA SK VNG WIFGTFL C 

KWSLLKEVNFYSGILLLACISVDRYLA1VHA 

TRTLTQKRYLVKFICLSIWGLSLLLALPVLLFR 

RTVYSSNVSPACYEDMGNNTANWRMLLRIL 

PQSFGFIVPLLIMLFCY GFTLRTLFKAHMGQK 

HRAMRVIFAVVLIFLLCWLPYNLVLLADTLM 

RTQVIQETCERRNHIDRALDATE1LGILHSCLN 

PLIYAFIGQKFRHGLLKILAIHGLISKDSLPKDS 

RPSFVGSSSGHTSTTL 


1045 


2395 


A 


8724 


254 


3184 


FRANL ArTV ANRRG AQGGKMHTCCPP VTLEQ 

DLHRKMHSWMLQTLAFAVTSLVLSCAETIDY 

YGE1 CDN ACPCEEKDGILTVSCENRGII SLSEIS 

PPRFPIYHLLLSGNLLNRLYPNEFVNYTGASIL 

HLGSNVIQDIETGAFHGLRGLRRLHLNNNKL 

ELLRDDTFLGLENLEYLQVDYNYISVIEPNAF 

GKLHLLQ VLILNDNLLS S LPNNLFRF VPLTHL 

DLRGNRLKJLLP YVGLLQI IMDKWELQLEEN 

PWNCSCELISLKDWLDSISYSALVGDVVCETP 

FRLHGRDLDEVSKQELCPRRLISDYEMRPQTP 

LSTTGYLHTTPASVNSVATSSSAVYKPPLKPP 

KGTRQPNKPRVRPTSRQPSKDLGYSNYGPSIA 

YQTKSPVPLECPTACSCNLQISDLGLNWCQE 

RKESIAELQPKPYNPKKMYLTENYIAWRRT 

DLLEATGLDLLHLGNNRI SMI QDRAFGDLTN 

LRRLYLNGNR1ERLSPELFYGLQSLQYLFLQY 

NLIREIQSGTFDPVPNLQLLFLNNNLLQAMPS 

G VFSGL'IXLRLN LRSNHFTSLP VSG VLDQLKS 

LIQIDLHDNPWDCTCDIVGMKLWVEQLKVG 

VLVDEVICKAPKKFAETDMRSIKSELLCPDYS 

DVWSTPTPSSIQVPARTSAVTPAVRLNSTGA 

P ASLG AG GGASS VPLS VLIL SLLL VFIMS VFV A 

AGLFVLVMKRRKKNQSDHTSTNNSDVSSFN 

MQYSVYGGGGGTGGHPHAHVHHRGPALPK 

VKTPAGHVYEYIPHPLGHMCKNPIYRSREGN 

SVEDYKX>LHELKVTYSS>4HHLQQQQQPPPPP 

QQPQQQPPPQLQLQPGEEERRESHHLRSPAYS 

VSTIEPREDLLSPVQDADRFYRGILEPDKHCST 

TPAGNSLPEYPKFPCSPAAYTFSPNYDLRRPH 

QYLHPGAGDSRLREPVLYSPPSAVFVEPNRNE 

YLELKAKLNVEPDYLEVLEKQTTFSQF 


1046 


2396 


A 


8736 


28 


452 


SPSAAGGLAWVSLALGSGSRGRDHSGSGVGT 

AMAGALVRKAADYVRSKDFRDYLMSTHFW 

GPVANWGLPIAAINDMKKSPEIISGRMTFALC 

CYSLTFMRFAYKVQPRNWLLFACHATNEVA 

QLIQGGRLIKHEMTKTASA 


1047 


2397 


A 


8741 


673 


924 


ALPGTPQQTVTLNTDGKVKSFTSPHSNPNLPP 
AKFFTSLQSLNWSSHLPPSPATESVGKRGNAK 
PPTTKLLHSSPLWNFFAQQL 


1048 


2398 


A 


8747 


3 


5054 


PEVTKPSLSQPTAASPIGSSPSPPVNGGNNAKR 
VAVPNGQPPSAARYMPREVPPRFRCQQDHK

VLLKRGQPPPPSCMLLGGGAGPPPCTAPGAN 

PNNAQVTGALLQSESGTAPDSTLGGAAASNY 

ANSTWGSGASSNNGTSPNPIHlWDKVrVDGS 

DMEEWPCIASKDTESSSENTTDNNSASNPGSE 

KSTLPQSTTSXKQKGSQCQSASSGNECNLGV 

WKSDPKAKSVQSSNSTTENNNGLGNWRKVS 

GQDRIGPGSGFSNFNPNSNPSAWPALVQEGTS 

RKGALETDNSNSSAQVSTVGQTSREQQSKME 

NAGVNPWSGREQAQIHNTTXjPKNGNTNSL 

NLSSPNPMENKGMPFGMGLGNTSRSTDAPSQ 

STGDRKTGSVGSWGAARGPSGTDTVSGQSNS 

GNNGNNGKEREDS WKGAS VQKSTG SKNDS 

WDNNNRSTGGSWNFGPQDSNDNKWGEGNK 

MTSGVSQGE WKQPTG SDELKJGE WSGPNQPN 

SSTGAWDNQKGHPLLENQGNAQAPCWGRSS 

SSTGSEVEGQSTGSNHKAGSSDSHNSGRRSY 

RPTHPDCQAVLQTLLSRTDLDPRVLSNTGWG 

QTQnCQDTVWDtEEVPRPEGKSDKGTEGWES 

AATQTKNSGGWGDAPSQSNQMKSGWGELS 

ASTEWKDPKNTGGWNDYKKNNSSNWGGGR 

PDEKTPSSWNENPSKDQGWGGGRQPNQGWS 

SGKNGWGEEVDQTKNSNWESSASKPVSGWG 

EGGQNEIGTWGNGGNASLASKGGWEDCKRS. 

PAWNETGRQPNSWNKQHQQQQPPQQPpppQ 

PE ASGSWGGFPPPPPGNVRPSNSS WSSGPQPA 

TPKDEEPSG WEEPSPQ SISRKMDIDDGTS A WG 

DPNSYNYKNVNLWDKNSQGGPAPREPNLP7T 

MTSKSASDSKSMQDGWGESDGPVTGARHPS 

WEEEEDGGVWNTTGSQGSASSHNSASWGQG 

GiCKQMKCSLKGGNNDSWMNPLAKQFSNMG 

LLSQTEDNPSSKMDLSVGSLSDJCKFDVDKRA 

MNLGDFNDIMRKDRSGFRPPNSKDMGTTD S 

GPYFEKGGSHGLFGNSTAQSRGLHTPVQPLN 

SSPSLRAQVPPQFISPQVSASMLKQFPNSGLSP 

GLFNVGPQLSPQQIAML SQLPQIPQFQLACQL 

LLQQQQQQQLLQNQRK1SQAVRQQQEQQLA 

RMVSALQQQQQQQQRQPGMKHSPSHPVGPK 

PHLDNMVPNALN VGLPDLQTKGPIPG YG SGF 

SSGGMDYGMVGGKEAGTESRFKQWTSMME 

GLPSVATQEANMHKNGAJVAPGKTRGGSPY 

NQFDHPGDTLGGHTGPAGDSWLPAKSPPTNK 

IGSKSSNASWPPEFQPGVPWKGIQNIDPESDP 

xvi x vj^ vL,uij i a i i>ri vu I UHQLLRDNTTGS 

NSSLNTSLPSPGAWPYSASDNSFTNVHSTSAK 

FPDYKSTWSPDPIGH>IFrHLSNKMWKNHISS 

RNTTPLPRPPPGLTNPKPSSPWSSTAPRSVRG 

WGTQDSRLASASTWSDGGSVRPSYWLVLHN 

LTPQIDGSTLRTICMQHGPLLTFHLNLTQGTA 

LIRYSrTKQEAAKAQTALHMCVLGNTTILAEF 

ATDDEV SRFLAQ AQPPTPAATPS APAAG WQS 

LETGQNQSDPVGPALNLFGGSTGLGQWSSSA 

GGSSGADLAGASLWGPPNYSSSLWGVPTVED 

PHRMGSPAPLLPGDLLGGGSDSI 


1049 


2399 


A 


8748 


200 


1387 


VPWKRQDEQLSLQVETLYLDSPAVrHLLSPTF 

LPPSSLPPFLQIVDSSSSACTLDSFFPFLAPWDS 

PQDCGFKDHQPLTLQALTVELARWTLMLLLS 

TAMYGAHAPLLALCHVDGRVPFRPSSAVLLT 

ELTKLLLCAFSLLVGWQAWPQGPPPWRQAA 

PFALSALLYGANNNLVmQRYMDPSTYQVL 
S^KIGSTAVLYCLCLRHRLSVRQGLALLLL

MAA GACYAA G GLQ VPGNTLPSPPPAAAASP 

MPLHTTPLGLLLLILYCLISGLSSVYTELLMKR 

QRLPL ALQNLFL YTFGVLLNLGLHAG GGSGP 

GLLEGFSGWAALVVLSQALNGLLMSAVMKH 

GSSITRLFVVSCSLVVNAVLSAVLLRLQLTAA 

FFLATLLIGLAMRLYYGSR 


1050 


2400 


A 


8758 


3 


1660 


W VS SMGFEELLEQ VG GFGPFQLRN V ALLALP 

RVLLPLHFLLPIFLAAVPAHRCALPGAPANFS 

HQDVWLEAHLPREFDGTLSSCLRFAYPQALP 

NTTLGEERQSRGELEDEPATVPCSQGWEYDH 

SEFSST1ATESQWDLVCEQKGLNRAASTFFFA 

GVLVGAVAFGYLSDRFGRRRLLLVAYVSTLV 

LGLASAASVSYVMFAJTRTLTGSALAGFTIIV 

MPLELE WLD VEHRTVAG VLS STFWTGG VML 

LALVGYLIRDWRWLLLAVTLPCAPGILSLWW 

VPESARWLLTQGHVKEAHRYLLHCARLNGR 

PVCEDSFSQEAVSKVAAGERWRRPSYLDLF 

RTPRJ-RHISLCCWVWFGVNFSYYGLSLDVS 

GLGLNVYQTQLLFGAVELPSKLLVYLSVRYA 

GRRLTQAGTLLGTALAFGTRLLVSSDMKSWS 

TVLAVMGKAFSEAAFTTAYLFTSELYPTVLR 

QTGMGLT ALVGRLGGSLAPL AALLDGV WLS 

LPKLTYGGIA1XAAGTALLLPETRQAQLPETI 

QDVERKSAPTSLQEEEMPMKQVQN 


1051 


2401 


A 


8759 


515 


1625 


EIRTPVAVSSAPSGDSEGDEEETTQDEVSSHTS 

EEDGGVVKVEKELENTEQPVGGNEWEHEV 

TGNLNSDPLLELCQCPLCQLDCGSREQLIAHV 

YQHTAAWSAKSYMCPVCGRALSSPGSLGR 

HLLIHSEDQRSNCAVCGARFTSHATFNSEKLP 

EVLNMESLPTVHNEGPSSAEGKDIAFSPPVYP 

AGILLVCNNCAAYRKLLEAQTPSVRKWALRR 

QNEPLEVRLQRLERERTAKKSRRDNETPEERE 

VRJRMRDREAKRLQRMQETDEQRARRLQRDR 

EAMRLKRANETPEKRQARLIREREAKRLKRR 

LEKMDMMUtAQFGQDPSAMAALAAEMNFF 

QLPVSGVELDSQ1XGKMAFEEQNSSSLH 


1052 


2402 


A 


8763 


1106 


70 


RHGHGGRDRRGGGRVARPGGLGRYPGRGAA 

ASLVFVPTRRRSGPSGTASVAAMAYHSGYGA 

HG SKHRARAAPDPPPLFDDTSGG YS SQPGG Y 

PATGADVAFSVNHLLGDPMANVAMAYGSSI 

ASHGKDMVHKELHRFVSVSKLKYFFAVDTA 

YVAKKLGLLVFPYTHQNWEVQYSRDAPLPP 

RQDLNAPDLYTPTMAFITYVLLAGMALGIQK 

RFSPEVLGLCASTALVWWMEVLALLLGLYL 

ATV RS DL STFHLL AYS G YK YVGMILS VLTGL 

LFGSDGYYVALAWTSSALMYFIVRSLRTAAL 

GPDSMGGPVPRQRLQLYLTLGAAAFQPLIIY 

WLTFHLVR 


1053 


2403 


A 


8768 


2 


712 


RPPRVWYPELRELSAAAPRWSHRTAPGIMVF 
YFTS SSVNS SA YTIYMGKDK YENEDLIKHG W 
PEDIWFHVDJCLSSAHVYLRLHKGENIEDtPKE 
VLMDCAHLVKANSIQGCKMNNVNVVYTPW 
SNLKKTADMD VGQIGFHRQKD VKTVTVEKK 
VNEILNRLEKTKVERFPDLAAEKECRDREER 
NEKKAQIQEMXKJRJEKEEMKKKREMDELRSY 
SSLMKVENMSSNQDGNDSDEFM 


1054 


2404 


A 


8769 


344 


527 


REATTLACRNSCWVFSRCSLGACKPTVCSMP 
SLSRQGSQTLCLRL AE Y CMES VDSQRLLLS 


1055


2405 


A 


8770 


430 


1104 


QQESPAAGAARMNCKEGTDSSCGCRGNDEK 
KMLKCVWGDGAVGKTCLLMSYANDAPPEE 
YVPTVFDHYAVTVTVGGKQHLLGLYDTAGQ 
EDYNQLRPLSYPNTDVFLICFSWNPASYHNV 
QEEWVPELKDCMPHVPY VLIGTQIDI .RDDPK 
TLARLLYNuKEKPLTYEHGVXLAKAIGAQCYL 
FXSALTQKGLKAVFDEAHTIFHPKKJCKKRCS 
EGHSCCSII 


1056 


2406 


A 


8773 


261 


332 


NPRIQLSGNSCCAGSCRVWLSEQ 


1057 


2407 


A 


8778 


3 


477 


PAGIRHEQARGADRMGKCRGLRTARKLRSH 
RRDQKWHDKQYKKAHLGTALKANPFGCAS 
HAKGIVLEKVGVEAKQPNSAIRKCVRVQLIK 
NGKKITAFVPNDGCLNFIEENDEVLVAGFGR 
KGHAVGDIPGVRFKVVKVANVSLLALYKGK 
KERPRS 


1058 


2408 


A 


8808 


171 


881 


PGLSQEPSGSMETVVIVAIGVLATIFLASFAAL 
VLVCRQRYCRPRDLLQRYDSKPIVDLIGAME 
TQSEPSELELDDVVITNPH1EA1LENEDWIEDA 
SGI MSHCIAII KICHTLTFKI VAMTMGSGAK 
MXTSASVSDIIWAKR1SPRVDDWKSMYPPL 
DPKLLDARTTALLLSVSHLVLVTRNACHLTG 
G LD W1DQSL S AAEEHLE VLRE AAL ASEPDKG 
LPGPEGFLQEQSAI 


1059 


2409 


A 


8809 


246 


757 


MRLQGAIFVLLPHLGPILVWLFTRDHMSGWC 

EGPRMLSWCPFYKVLLLVQTAIYSWGYASY 

LVWKDLGGGLGWPLALPLGLYAVQLTISWT 

VLVLFFTVHNPGLALLHLLLLYGLWSTALI 

WHPINKLAALLLLPYLAWLTVTSALTYHLWR 

DSLCPVHQPQPTEKSD 


1060 


2410 


A 


8810 


304 


381 


PKLSVYPLQSHHCLSEPFQSLVCCLA 


1061 


2411 


A 


8820 


1673 


848 


SCKTENLLEMWWFQQGLSFLFSALV1WTSAA 

FIFSYTTAVTLHHIDPALPYISDTGTVAPEKCLF 

GAMLNIAAVLCIATIYVRYKQVHALSPEENV1 

IKLNKAGLVLGILSCLGLSIVANFQKTTLFAA 

HVSGAVTTFGMGSLYMFVQTELSYQMQPKIH 

GKQVFWIRLLLVIWCGVSALSMLTCSSVLHS 

GNFGTDLEQKLHWNPEDKGYVLHMITTAAE 

WSMSFS FFGFFLTYIRDFQKISLR VEANLHGL 

TLYDTAPCHINNERTRLLSRDI 


1062 


2412 


A 


8824 


1 


763 


GGAPPASVPARESPVSGAQGSSRTRGHKRAA 

GARAPQLCSSWQRRSAPAMSRGLQLLLLSCA 

YSLAPATPEVKVACSEDVDLPCTAPWDPQVP 

YWSWVKLLEGGEERMETPQEDHLRGQHYH 

QKGQN GS FDAPNERP YSLFORNTTS CNSGTYR 

CTLQDPDGQRNLSGKVILRVTGCPAQRKEET 

FKKYRAEIVLLLALVIFYLTLIIFTCKFARLQSI 

FPDFSKAGMERAFLPVTSPTWCHLGLVTPHKT 

ELV 


1063 


2413 


A


8826 


147 


627 


CETSTSSAGHAPCRHAAQGPPAEPTGLRIXSE 

HQRLHAWPPGPRRPSLWPPKNGKWHSGKRT 

AGGRPQRRPSRRQSQRPSAWSGSPRMHSPGQ 

KCSLMCPHRSQDSL STABFQRSPG ANTORALH 

CVLSKEMKSVQRSLGLSRIHLQSKRKIIHFVL 

TR 


1064 


2414 


A 


8835 


2982 


1869 


LIOyrUCSQMTQEASDEAEDMKEAMNRMIDE 
LNKQ VSELS QLYKE AQ AELED YRKRKSLED V 
TAEYH IKAEI IEKL M Q LTNVSRAKAED ALS E 
MKSQY SKVLNELTQLKQL VDA QKENS VSITE 
HLQ VITTLRTAAKEMEEKI SNLKEHL ASKE VE 
VAKJLEKQLLEEKAAMTD AMVPRSS YEKL QS

SLESEVSVLASKLKESVKEKEKVHSEWQIRS 

EVSQVKREKENIQTLLKSK£QEVNELLQKPQ 

QAQEELAEMKRYSESSSKLEEDKDKKINEMS 

KEVTKLKEALNSLSQLSYSTSSSKRQSQQLEA 

LQQQVKQLQNQLAECKKQHQEVISYYRMHL 

LYAVQGQMDEDVQKVLKQILTMCKNQSQ1C 

K 


1065 


2415 


A 


8841 


3 


663 


AAATAASLSPRGCRLRTPSSDVGPSRAPPPSA 

APLPTGRAQMSPSGRLCLLTIVGLILPTRGQTL 

KDTTSSSSADATIMDIQVPTRAPDAVYTELQP 

TSPTPTWPADETPQPQTQTQQLE GTDGPL VT 

DPETHKSTKAAHPTDDTTTLSERPSPSTDVQT 

DPQTLKPSGFHEDDPFFYDEHTLRKRGLLVA 

AVLFJTGIIILTSGKCRQLSRLCRNHCR 


1066 


2416 


A 


8853 


3806 


2204 


FVGEQEGGCEAGAGRGAQTYPGEAGERWFG 

RRRRRGRWSRKKMSLKSERRGIHVDQSDLL 

CKKGCGYYGNPAWQGFCSKCWREEYHKAR 

QKQIQEDWELAERLQREEEEAFASSQSSQGA 

QSLTFS1CFEEKKTNEKTRKVTTVKKFFSASSR 

VGSKKEIQEAKAPSPSrNRQTSIETDRVSKEFIE 

FLKTFI IKTGQEI YKQTKLFLEG MH YKRDLSIE 

EQSECAQDFYHNVAERMQTRGKVPPERVEKJ 

MDQIEKYIMTRI.YKYVFCPETTDDEKKDLAI 

QKRIRALRWVTPQMLCVPVNEDIPEVSDMVV 

KAITDnEMDSKRVPRDKLACITKCSKHIFNAI 

KITKNEPASADDFLPTLIYTVLKGNPPRLQSNI 

QYITRFCNPSRLMTGEDGYYFTNLCCAVAFIE 

KLDAQSLNLSQEDFDRYMSGQTSPRKQEAES 

WSPDACLGVKQMYKNLDLLSQLNERQERIM 

NEAKKLEKDLIDWTDGIAREVQD1VEKYPLEI 

KPPNQPL AATDS EN VENDKLPPPLQPQVYAG 


1067 


2417 


A 


8855 


1372 


1513 


SNMREVGCGWLVPVIPAFWEAEVGGSLEARS 
LRQ A W ATKQDPI SKKK 


1068 


2418 


A 


8856 


1530 


1583 


PCRPGMECNSMISVHCNL 


1069 


2419 


A 


8857 


1530 


1583 


PCRPGMECNSMISVHCNL 


1070 


2420 


A 


8866 


293 


1675 


P YPQGGYPQGPYPQEG YPQG P YPQGG YPQGP 

YPQSPFPPNPYGQPQVFPGQDPDSPQHGNYQ 

EEGPPSYYDNQDFPATNWDDKSIRQAFIRKVF 

LVLTLQLSVTLSTVSVFTFVAEVKGFVRENV 

WTYYVSYAVFFISLIVLSCCGDFRJRKHPWNL 

VALSVLTASLSYMVGMJASFYNTEAVIMAVG 

rTTAVCFTVVIFSMOTRYDFTSCMGVLLVSM 

VVLFIFAILCIFIRNRILE1VYASLGALLFTCFLA 

VDTQLLXONKQLSLSPEEYVFAALNLYTDHNI 

FLYILTIIGRAKE'PSSSSLCPLRWHGWPGPCP 

WHGSASCTSPLSCPQAQPREKDASLQPSCMY 

TADTSIWTRCGH SMAPLVLPPPPRGTKATFPC 

HLLSTHCCMSPVCQPTPGTGGSTRSRGEGLSQ 

EVRVHVFPPVPAPQPGVEHPSPPPHPPGVLPS 

GDMRSGGLIPVLSPE 


1071 


2421 


A 


8868 


2 


358 


ARGNTLYHLPRLCRKLNLRWFSASTLYDVQH 
DDKMGSNTFFKRNDCRYVM1SCKADMAYDN 
VRHPFMI* SI^KLIMEETYLNIIKAVYDRFTASU 
LNGEKLKVFPVRSGT*QGCSVWP 


1072 


2422 


A 


8870 


33 


658 


MESVLSKYEDQITIFTDYLEEYPDTDELV\VTL 
GKQHLLKTEKSKLLSDISARLWFTYRRKFSPI 
GGTGPSSDAGWGCMLRCGQMMLAQALICRH 
LGRDWS WEKQKEQPKEYQRILQCFLDRKDC 
CYSIHQMAQMGVGEGKS1GEWVLGPNTVU.Q
GV*KNLA\LFDEW\NSLGLVYVSM\DNPyGSIA 
RFPKKLCRVLPL\SADTAGLTGP 


1073 


2423 


A 


8879 


146 


412 


DFSV*GDVDIEVTCPICLQLLTEPLSLNCGLRL 
*QVCITA*IKESVnSGG*SSSPVCHTTFQPANL 
RTSRYLPT*SIKSLGPDEPQEG 


1074 


2424 


A 


8884 


67 


435 


HLQGRSIRTLQLTGENEfCNCEVSERIRRSGPW 
KE1SFGDYICHTFQGDCWADRSPLHEAAAHG 
RJXALKTLIAQGVNVNLWTUDRVSSLHEACL 
* GPV ACAKPY WKMVPRHGGTVTGPPLLMV 


1075 


2425 


A 


8896 


1294 


248 


RSGDRNGLTHQLGGLSQGSRNQSYRSRSRSR 

SRERPSAPRGIPFASASSSVYYGSYSRPYGSDK 

PWPSLLDKEREESLRQKRLSERERIGELGAPE 

VWGLSPKNPEPDSDEHTPVEDEEPKKSTTSAS 

TSEEEKKXKSSRSKERSKKi<RKKKSSKRKHK 

KYSEDSDSDSDSETDSSDEDNKRRAKKAKKK 

EKKKKHRSKK YKKXRSKKSRKESSDS SSKES 

QEEFLENPWKDRTKAEEPSDLIGPEAPKTLTS 

QDDKPLNYGHALLPGEGAAMAEYVKAGKRI 

PRRGEIGLTR*RNCHHLNAQVM**VVSRHRR 

MEAVRTAKREPESTVLMRREPLHPFNPRRET 

KERB 


1076 


2426 


A 


8899 


146 


789 


GRSTEAEKEPAFDERTGKGRRLPRAGEFHG* E 

* APGPGPRSFQ VSRKMPEEVPPGARKHPFSG KS 

FYLDLPAGKNLQFLTGAIQQLGGVIEGFLSKE 

VSYrVSSRREVKAESSGKSHRGCPSPSPSEVR 

VETSAMVDPKGSHPRPSRKPVDSVPLSRGKE 

LLQKAIRNQK* ♦CTVQQLSHCRLY\GEKTTAK 

RSQREHVQQQSQEHGKWPDLKGPR 


1077 


2427 


A 


8901 


352 


3 


AKIGAYKYIQELWRKKQSDVMHFLLRVRCW 
QYPALHRAGTEWQLSALHRAPRSTQPDKAC 
RLGYKAKQGYIIYRICVRRGOWKCPVPKAVT 
\YGKPVHHGVN*LKFAQSLQSVAEEQ 


1078 


2428 


A 


8905 


536 


781 


ACPAENREVPEMAAGQAPHAGPGAGPGQPA 
PALPFAATPGSRGQALCRGGRRRQHLHGPLH 
RP*QAAPALHAGCQLAPHPPT 


1079 


2429 


A 


8912 


121 


376 


NLI WKLCVTERRLVILDNYDLASE/YEANKYI 
CNRHQFKPGQDKYFTLGLPTGSTPL'CYPKLI 
EYNKNGHLSFKYVXTFSMDEY 


1080 


2430 


A 


8920 


381 


1788 


SSESPSDPGRMAMTWIVFSLWPLTVFMGHIG 

GHSLFSCEPITLRMCQDLPYNTTFMPNLLr4HY 

DQQTAALAMEPFHPMVNLDCSRDFRPFLCAL 

YAPICMEYGRVTLPCRRLCQRAYSECSKLME 

MFGVPWPEDMECSRFPDCDEPYPRLVDLNLA 

GEPTEGAPVAVQRDYGFWCPRELKIDPDLGY 

SFLHVRDCSPPCPNMYFRREBLSFARYFIGLIS 

IICLSATLFTFVTFLroVTRFRYPERPIKCYAV 

WHMMVSLIFFXIGFLLEDRV ACNA\S IPAQYKA 

STVTQGSHNKACTMLFM1LYTFTMAGSVWW 

VILTITWFLAAWKWGSEAIEKKALLFHASA 

WGIPGTLTDLLAMNK1EGDN1SGVCFVGLYD 

VDALRYFVLAPLCLYVWGVSLLLAGIISLNR 

VRffiIPL* KENQDKL VKFMIRIG VPSIL YL VPLL 

W1GCYFYEQAYRGIWETTWIQERC 


1081 


2431 


A 


8922 


56 


420 


EERTKMSTGPDVKATVGDISSDGNLNVAQEE 
CSRKGIVDEFFPLLSN*CIWTQPQGYPQSSYG 
TLANFVFXCSVRHGLALILQLCNFSIYTQQMN 
L S 1 AJ P AMVNNT APPS QPN A S TERP ST 


1082 


2432 


A 


8923 


355 


1079 


PFGTPSSTMAVVKNKCLMKGGKXGVKKKVV 
GPFSKKDQYDVKAPAMFNIRNTGK/TLVART

QGTQIASDGLKGLLFEVSLADLQNDEVAFRK 

FXLITEDVQDKNCLTT^FYGMDLTCDKJCSMV 

EKWSTMIEAHVDVKTTDGYFFHLFCVGFTKK 

HNNQILKTSYA*HQQS/RQIQKKMMEIMT*EV 

QTNDLKEWNKLIPDNIGKDTEKV/CPIYPLH 

DVFIRKVKMLENPGFER\MELRGGOSSS 


1083 


2433 


A


8948 


28 


385 


LTV^QPHIPSCPAMSEETLQSKJLAAAKKKLP 
WGAVQGSRAMSDLLLLLLDLTLLLLXMLLGF 
AGYSGOLAGVAVSAGSPPI/RYKFHVEPYGET 
GWLLT/ESCSISPKLCSIAVH*DNPAWF 


1084 


2434 


A 


8950 


156 


318 


HYTPINTDTIENSENNKCW*GY*E\VGLIHHW 
WGGKRVQPFWKRVWQKRTLNLRV 


1085 


2435 


A 


8956 


16 


413 


HMGQLGYFIQCWWECKRLISF\WKTI*QSPAK 

*TIYTSYDTAIPIS/GI/YPKRMSSKCHQETCAR 

MFILAPFTATIKGKQLTCPLVEERIDYVMWYS 

HKYYTKVKRNL+VTITH\TWVNLN1LMFEIILW 

YSHKYY 


1086 


2436 


A 


8962 


86S 


1026 


H*KILQVGRAQRAHXSRL*SQLLRRLRHESHL 
NPGARGCSEARLHRCTPAWTT 


1087


/ 






JO 




♦ERSVCAFHVCIQTYVCLQVYACMCVYYICM 
FVYSVYGCGLCTCVCMDVYICVCVQEFL 


1088 


2438 


A 


8989 


394 


404 


N*KWILHVNVRIQSIFF/IKRNQK/INSHELKLD 

VTCPJ rt\A\A^IA* < ^TKKUnV\ n/T IKFVT7T r<!A 
i\J\^iji^rviivioiN/v o i rv jviri_^ L^irvi rv i / i_^v_, o/v 

KYTVKRJKIHPTDLEKMLRNHLSDKD*YS/GV 

YKDLSKLNRRKTE/S*/VKKWVKDLSRYFIKE 

VISMENKHKJCIFSTS 


1089




A 
AV 




fin 


329 


MAT TPFSPSSFPOT A ATGSSVPFPPGOPTsfAT] 
N S S WDSPTEPS SLEDLEATGTIGTLLSDMG W 
GVEDNAYTLEVNSRYMRAVGIM*IHL 






1090


2440


A


0770 




DJ 1 


WQESKFIQAFWSKIQQYLATSIHILFDPAFLFL 

GGYPGGTQSVFLTGVLVSSVFYNNIKMLHTR 

LL1AALFIIVQYWKQSKDHYI 


1091 


2441 


A 


8997 


97 


454 


YPLPVCSYLSGPRGEHWNSLGGKSSCPLPLPT 
L VSSRFKISKVI WGDLS VGKTCLINR* GGAG 
AELGRVGPSLARWAGSRSQHLVPSQWCKDS 
FDKNYKAPIGADFEMERFEVLGLPF 


1092 


2442 


A 


8999 


548 


811 


S SFIKivHILIFEDD WHQTTCCF1HPHHP\F* RCQ 
FHFYVSVQNSISPSLSVSSSHPDRPDHEVHQH 
RAAIIHHQHG QGPLGHGLV ARVG 


1093 


2443 


A 


9002 


3 


2745 


ALLGLQQPAQS LILSRS SVMG VRGLQGFVGS 

TCPmCTVVNFKXUVEHHRSKYPGCTPTTVVD 

AMCCLRYWYTPESWICGGQWREYFSALRDF 

VKTFTAAGIKXIFFFDGMVEQDKRDEWVKRR 

LKKNREISRJFHYIXSHKEQPGRNMFFIPSGLA 

VFTRFALKTLGQETLCSLQEADYEVASYGLQ 

HNCLG1LGEDTDYLIYDTCPYFSISELCLESLD 

TVMXCREKLCESLGLCVADLPLLACLLGNDI1 

PEGMFESFRYKCLSSYTSVKENFDKKGNIILA 

VSDHISKVLYLYQGEKKLEEJXPLATKQSSFL 

*RNGIISFTRT/INLHGFSKKPKV**LWTNK*YP 

RVQ1TNPGKKFPCVQMLNPGKKFPCVQALNP 

GEKJTCIHI/PEPRQEVPTCSDPEPRQEVPTCTG 

PESRREVPMCSDPEPRQEVPMCTGPEFRQEVP 

MCTGPEARQEVPMCTDSEPRQEVPMCTDSEP 

RQE VPMYTGSEPRQE VPMYTGPE SRQE VPMY 

TGPESRQEVLIRTDPESRQEIMCTGHESKQEV 
PICTDPISKQEDSMCTHAEINQKLPVATDFEFK

LEALMCTTNPErKQEDPTNVGPEVKQQV'I'MVS 

DTEELKVARTHHVQAESYLVYNIMSSGEIECS 

NTLEDELDQALPSQAFIYRPIRQRVYSLLLED 

CQDVTSTCLAVKEWFVYPGNPLRHPDLVRPL 

QMTDPGGTPSLKILWLNQEPEIQVRRLDTLLA 

CFNLSSSREELQAVESPFQALCCLLIYLFVQV 

DTLCLEDLHAFIAQALCLQGKSTSQLVNLQP 

DYTNPRAVQLGSLLVRGLTTLVLVNSACGFP 

WKTSDFMPWNVFDGKJLFHQKYLQSEKGYA 

VEVL/CRTK*ISAHQIPQPEGSRLQGLHEGEQT 

HHWPSPLGLTPRREVGKTGLQLPQDGLWV 


1094 


2444 


A 


9021 


97 


834 


AREACRAKTDFPGRRFRLWPSCCCRVrVGAE 

T* H\MAEPVSPLKHFVLAKKAITAIFDQLLEFV 

TEGSHFVEATYKNPELDRIATEDDLVEMQGY 

KDKLSIIGEVLSRRHMKVAFFGRTSSGKSSVI 

NAMLWDK.VLPSGIGHITNCFLSVEGTDGDKA 

YLMTEGSDEKXSVKTVNQLAHALHMDKDLK 

AGCLVRVFWPKAKCALLRJDDLVLVDGPGTD 

VTTELDSWIDKFCTKSSTREITNSGSDT 


1095 


2445 


A 


9022 


1 


537 


LVLNSRVEDFVPPEGAGR71>PFALRPLAACW 
LLHRRARRSSALCPRPRSWGVSGGEGAGARE 
P*ITSSSCCLSAA/SHLSIQSPNMAGARRR1RPQ 
LAKEKXEGCHICTSVTPGEPQVFLGKDKAFTF 
DYVFOIDSOQEQIYIQCIEKLIEGCFEGYNATV 
FAYGQTVGAGKTYTMGTGFD 


1096 


2446 


A 


9029 


1 


285 


FFFFNVCKSPKVPKPG CKEESTGTLFKNTLISL 
G QH SETPSLKKKXL AG Y SGMCL* SQ VLRRLRQ 
EDCLSPGGGNCRES* SCPYTPAWITERDPV 


1097 


2447 


A 


9032 


716 


357 


ARSTGFWGEILWCGFLKRSLALSPRVKCSGAI 
LAHCNFRHAGFPPLSCLSLPNRWEYRRPPARP 
GKFFLVFLVETGFQC/G*DGLDLLTSRSACLG 
LPKCWDYRRJEPAASIIFQTTFFINSK 


1098 


2448 


A 


9038 


230 


652 


KVVVMSCEDIMSGSFYRNKLKYLAFLCKRTS 
TNPS QGP YHL WVP SH1FWQTTCGRL PHKTK Q 
G*AALDHLKVFDRTPLPYDKKXQMAVSATLE 
WRPKP* RKF AYLGH WAQKVDWKYQ AMT A 
TMGEKRKVYYQKICYQKX 


1099 


2449 


A 


9043 


185 


372 


I1F YSHQQCMRVAVQGCGDIETLIHC W* E*KII 
HSL/WK/TV*QFLKRLYLHLPHNSV1AFLGISP 
RKIKTCPQNSCTSMLINA1HNDQKWKKINI 


1100 


2450 


A 


9045 


763 


584 


RQSLALSPRLECSGTISAHCRLCPLVFTPLSCL 
SLTSSWDYRRPPPHPANFLYFK*RRGF 


1101 


2451 


A 


9050 


275 


2 


LFFLRKVSNQFLSPSLLPVNFQG FVF AFLLLLL 
FLL/FEMESLPVA/RVECSGTISAHCNLCLPGSS 
DSPASAS*VAGITDMCRYTQLILFHAS 


1102 


2452 


A 


9053 


449 


1224 


KTSMFWKFDLHSSSHIDTLLEREDVTLKF-.I .M 

DEEDVLQECKAQNRKLEEFLLKAECLEDLVSF 

I\* EEPPQDMDEKIRYKYPNISCELLTSDVSQM 

NDRLGEDESLLMKLYSFLLNDSPLNPLLASFF 

SKVLSILISRKPEQIVDFLKKKHDFVDLIIKHIG 

TSAIMDLLLRLLTCIEPPQPRQDVLN/WFKVQ 

RNL*HST*NVMDISKYVNLHWGLNKSHSLL* 

LLLQCVLQWLNEEKIIQRLVEIVHPSQEEDVS 

SLV 


1103 


2453 


A 


9058 


403 


3 


GLHVYDFQVYREHILTLNVKJCCSVSFWGLRE 
WLYLQMYEriKSPRFPIIKMTDITKCW*GC\GA 
A GM QI/H/CW\ W C VN V G KF W EM S * YYLL KLS I 
ST/PYDPAIPLLGIYT-^ETRVYIHPKTCMKMLIA 
APFVLAVNC


1104


2454 


A 


9064 


75 


393 


KWLFSSLNITGRGDIIGHLKWLDCR\NCSSFPI 
KRNRQTHSTESNKLKAGHSFGYN*LIH*NS\V 
KTDCGCGANSKGVVVVMKVXKTAQQKQTTS 
YMQIGTTKNSRAT 


1105 


2455 


A 


9065 


366 


778 


DLLILRNLAFPELKRRNCISRFYLAYHLHKIYS 
RSILLCNNCSGFYILSL* QYDVFFFNYFFFRDR 
AWPCCPGWSAAWLTIVILAHYRRPGLERSCC 
LSLSSSWDHRRVPPCPANF*/YFSMGFTAFPRL 
VLNS'TQGI 


1106 


2456 


A 


9083 


673 


816 


ESGSL1H*WWENKPAQPLWWEI*QHVQKLPT 
I IFrCDPAIPLLGICPED 


1107 


2457 


A 


9086 


580 


18 


KPSSGSFIRAIYIFLSTAHVPALFSVLVRTKLT* 

VRENFLPIRLAKILKLDNVKCWQG/SGSNMSL 
I/HC WWE YNVIHII WN S VTFPRKVEH VYITYA 
PEISVR*IHGGLPTLVHQETHTSVFRGAPSVIP 
ETRVCRPTKESiNKLLHIYTMEHYGDENK 


1108 


2458 


A 


9093 


540 


1 


GGNDCS VTTTTEPGRKJ-:ir* KRKF* EK.TDRLP 
fiA/PP < vRTPPTPYPC , PHf"Vr)RI I PPSRPT PAfiPA 
S AFPPAERSRGHRRASL* RAR WS AA VPRRS A 
GSASEPVQSRWLRLPVGSDSPPAVPVRVCPAP 
DSRPAAPGSRLPDPGLDSPAPSRTPSSSVD*GG 
QRPPPPSGDSLSPPGCCRY 


1109 


2459 


A 


9099 


1255 


1425 


HESYHVNPNLCNPVAPTSGAHSIG*KWPSWL 
GAVAHSCNPSTLVGRGGRJTRGQELR 


1110 


2460 


A 


9103 


242 


70 


EEQFFFFAVGMFP*VDFLAPASGELWDRLRLT 
CSRPFTRI IQSFGLAFLRVCSSLDSLDDSWGP 
SALLSS VL/NQGGRN VLE AREAAKHPTI* RQS 
LLRKORNKRMAIP 


1111 


2461 


A 


9110 


189 


121 


SFLSVRJLECNGATMAHCALPLPG 


1112 


2462 


A 


9113 


100 


910 


RRRGGGSRPRRTPVPAPGPGPSFGMDVRFYP 

AAAGDPASLDFAQCLGYYGYSKFGNNNNYM 

NMAEANNAFFAASEQTFHTPSLGDEEFEIPPIT 

PPPFSDPAI DMPDVT T PFOAI SnPT PSOOSFFT 

PQFPPQSLDLPSmSRNLVEQDGVLHSSGLHM 

DQSHTQVSQYRQDPSLIMR\PSST*PDAARSG 

VMPPAQLTTINQSQLSAQLGLNLGGASMPHT 

SPSPPASKSATPSPSSSINEEDADEANRAIGEK 

RAAPDSGKKPKTPKK 


1113 


2463 


A 


9120 


3452 


3051 


FLRPSFALVPQAGVQWCALSWLQPPSPRFK* F 
SCLSLPSS WD YRHVPPRPANFFVLL VETGFLH 
VGQAGHEPLTSGDPPASASQSAGITGVSHQA 
WPSFFIFSRDTVLLCCSGWSRTSGLKQSACLS 
LLKCWDY 


1114 


2464 


A 


9122 


152 


377 


NQLPLQQWTFFIYETGFCSVAQAGVQCRDHS 
SLHP*PPG\SSDPPAPPS*VLGITGQRYHACLn 
YLYVQTVPQRV 


1115 


2465 


A 


9124 


553 


981 


QRPLLRQQLGSWPTCRSLEGDLASPW**RLPG 

SPRMRRSGT/ATLNLPLSPQGTVRTAVEFQV^l 

TQTQSLSFLLGSSASLDCGFSMAPGLDLISVE 

WRLQHKGRGRGDLHLPDHHLSVPSSADHPA 

QQPSQFNGRNLYFLPLFR 


1116 


2466 


A 


9135 


48


410 


SASHEPAEHDGGADSLSASQPPRPAGRPAGA 

QHVHVPPWTDVLAGQDRRAPTAGDGAPWP 

APGGHVPSTRPHDPAEFHADEAAGRGGRGLQ 

P AAPHALP AGLFHGPP AP A/P AEGG GTP* GS A 

GAGGP*GSPAGRACGAAGCRPRPPRPAASSA 

*NSAGS*GLVEGT*PPGAGHGAPSPAVGARLS 
CP ARFS VQG GT WTC* APAGRP AGLG G WE AE
RES APPSC SAGS* D AD* G AEPWG AGSRS WGS 


1117 


2467 


A 




380 




K*?r;RWAifFn oppTPPnp(™PTPVrtPRWi<r<ir>rp 

cvovjn w rv rvi --L^n^i ivlt t xvt v. riL- v vjrn v> rvoL/i^r 

TCPGAVPRAPGTLPQGSLXDSFPDLLSLVAED 

*CCLMASEASWTTr\ELWVTLTVEGKSVP/CL 

NTEATHSTLPSFQGPVSLASITVVGIDGQASKP 

LKTPQLWCQLGQYSFMHYFLVIPTCPVPLLG* 

GILTKLSAFLTIPRLQPHLIAALSPSS 


1118 


2468 


A 


9154 


471 


2 


AAGQVWEVTSHLYLCnSDAAGLRIXPPAES 
ERGEGGHCPAEAPLPPRPQYCLAKHPLLRKLP 
EFKIKLDPYLTQHTKJNSKQfKYLSA'-RAKTTQ 
LVEGNIG VNLQNTELKQI I'INGFLDTTPEAQE 
TXEKTNKLNFrKKVKRQLAEWEKIFQIA 


1119 


2469 


A 


9155 


2 


3187 


ACPRLARRRRRVRSLRRRRGWLRARWSRGQ 

NNMAARRJTQETFDAVLQEKAKRYHMDASG 

EAVSETLQFKAQDLLRAVPRSRAEMYDDVHS 

DGRYSLSGSVAHSRDAGRESLRSDVFSGPSFR 

SSNPSISDDSYPRKECGRDLEFSHSNSRDQVIG 

HRKLGH FRSQDWKF ALRGS WEQDFGHPVSQ 

ESS WSQEYSFGPSA VLGDFG S SRJLIEKECLEK 

ESRDYDVDHPGEADSV/LRGGSQVQARGRAL 

NIVDQEGSLLGKGETQGLLTAKGGVGKLVTL 

RN V STK K I PT VNRTTPK.TQ GTNQ I QKNTPS PD 

VTLGTNPGTEDIQFPIQKIPLGLDLKNLRLPRR 

KMSFDIIDKSDVFSRFGIEIIKWAGFHTIKDDIK 

FSQLFQTLFELETETCAKMLASFKCSLKPEHR 

DFCFFTIKFLKHSALKTPR\T>NEFLNMLLDKG 

AVKTKNCFFEIIKPFDKYIMRLQDRLLKSVTP 

LLMACNAYELSVKMKTLSNPLDLALALETTN 

SLCRKSLALLGQTFSLASSFRQEKIL*AVGLQ 

DIAPSPAAFPNFEDSTLFGREYIDHLKAWLVS 

SGCPLQVKKAEPEPMREEEKMIPPTKPEIQAK 

APSSLSDAVPQRADHRWGTIDQLVKRVTEGS 

LSPKERTLLKEDPAY WFLSDENSLEYKYYKL 

KLAEMQRMSENLRGADQKPTSADCAVRAML 

YSRAVRNLKKKLLP\WQRRGLLRAQG\LRG\ 

WfTf A O D AVTrrTHTI T FT VTJUHD /~\ A T>/~1 

WK^AKKAM 1G 1 Vl^^rLKArOljKrlrlLiKvJArO 
i cpi a vpey pr»i}'\rr^Ad'i<r'rir*ppriPVrtPCPO'r>PCT 

EA S GPSPKPA GVDISEAPQTSSPCPSADIDMKT 
MET AEKL ARFV AQ VGPEIEQF SIENSTDNPDL 
WFT T-rnO"M^*sAFKFY1iKKVFFT PPSTPFT^PH 
NLHTGGGDTTGSQESPVDLMEGEAEFEDEPP 
PREAELESPEVMPEEEDEDDEDGGEEAPAPG 
GAGKSEGSTPADGLPGEAAEDDLAGAPALSQ 
ASSGTCFPRKRI S SKSLKVGMIPAPKR VCLIQE 
PK.GECPPVGTVASSTVLGWWAVRVRRDRWR 
HFNPKEFCAPLQNVSRHSCFPW 


1120 


2470 


A 


9163 


124 


207 


PPRACRPCPRACPCPPT*KCSQPVSWPC 


1121 


2471 


A 


9166 


272 


523 


PMSSLQOCFYTFKCIIFKGIFLLLISNI JAF* *EK 

V/CSHITDSLKFIGKGWVGMVTHACNPGTLG 

G*GGWIA*VREFETSLGNM 


1122 


2472 


C 


9170 


442 


236 


MNRRRFLRPADCHSGMRGTENGACSEGESQI 
HCGAGGEGVQLVHWNQPENGCLQFDSTHIT 
FSKRQN* 


1123 


2473 


A 


9171 


10 


423 


MVDRSPLLTSVIIFYLA1GAAIFEVLEEPHWKE 
AKKNYYTQKLHLLKEFPCLGQEGLDKILEW 
SDAAGQGVAITGNQTFNNWKWFNAMIFAAT 
VITTIGYGNVASKTPGGRLFCGFYGLFGVPFC 
LTWINALGKFFG 


1124 


2474 


A 


9173 


3 


374 


GP SPSLL VLLPQEPG GTGTPVRAG AG AGMWL 
WEDQGGLLGPFSFLMLMLLLETRNPVNACLL 
TGSLFVLLGVFSFEPVP SCRALQELKPRDRJS A 
1AHRGGRHDPPENTLGAIR/QGS* * WSNRR 


1125


2475 


A 


9179 


704 


188 


ESSSGLLFQCFQGIHVQKLTLQARPTLFSWWL 

CSKPPKETGELENAESGGDGGRRGGKQDNV 

AWWRRM\QKG\DFPWDDEDFPQSGPFGGQA 

LPMGFFYLYFRDPGRErTWKHFVQYYLARGL 

VDRLEVVNKQSVRVIPAPGTSSEVRGEFKAE 

YCRHKFISCKNWFYFFQ 


1126 


2476 


A 


9183 


153 


233 


MEYMAESTDRSPGHILCCECG VPI SPN 


1127 


2477 


A 


9185 


1 


321 


LTGQLGSILLRVFSKSRAGLGARKLKAYRTM 

FYMAPSTDR^PGT-TTl CT'FPGVPTST'MPAnwrv 
i ivi/t. i >o i L/rvor vji jjj_/\^\ W '*_iV_#vj » x iox nr **i< l w>- v 

ACLRSSFHIYHCIPKLFIHPFSKTSSSAFITPSHY 
LTFFSTIS 


1128 


2478 


A 


9186 


183 


847 


VLKFLLLQTMDEQSQGMQGPPVPQFQPQKAL 

R PHM fiVNTl A\JFRTFK , K - Tf1RGf)\F < IFWR A AP 

IVx ivivj I It 1 L/V.^ l IVl ILIVIVl vJrVVJ ^roLV I J\J\f\\*. 

L\LDGVPVALKKVQIFDLMDAKARADCIKEID 
LLKQLNHPNVIKYYASFIEDNELNIVLELADA 
GDLSRMIKHFKKQKRLIPERTVWKYFVQLCS 
AI Fl-TMHSRRVX/TWRnTKPAWVFrTATGVVK'I Ci 

DLGLGRFFSSKTTAAHSLVGTPYYMSPERIHD 
NG 


1129 


2479 


A 


9190 


1 


370 


GTSWKIPSAAVSESSPNGAAYASGLFCGVRG 
PPWAGLALI J>SPTLM AI XRRPTVSSDLDNIDT 
R ATRKTR VVATTTR ARIFDMRHSATAI TRPD 
ATTAQIPKLPVTTVCNRRANPGIPFSVL 


1130 


2480 


A 


9194 


131 


487 


AYLKRLPVPESITGFARLTVSEWLRLLPFLGV 
LALLGYLAVRPFLPKKKQQKDSLINLKIQKEN 
PKVVNEINIEDLCLTKAAYCRCWRSKTFPAC 
DGSHNKHNELTGDNVGPLILKJCKE 


1131 


2481 


A 


9201 


184 


605 


KEL VDEKSERGRAMDP VS QLAS AGTFR VLKE 
PLAFLRALELLFAIFAFATCGGYSGGLRLSVD 
CVNKTESNLSIDIAFAYPFRLHQVTFEGVPTCE 
GKERHKLALIGDSSSSAEFFGTVAGFAFLYSL 

A ATfrVYTFFON K Y 


1132 


2482 


A 


9206 


1 


852 


GGGRAGAGSRDMGSTDSKLNFRKAVIQLTTK 

TQPVEATDDAFWDQFWADTATSVQDVFALV 

PAAEIRAVREESPSNLATLCYKAYEKLVQGA 

ESGCHSEKEKQIVLNCSRLLTRVLPYIFEDPD 

WRGFFWSTVPGAGRGGQGEEDDEHARPLAE 

SLLLA1ADLLFCPDFTVQSHRRSTVDSAEDVH 

SLDSCEYIWEAGVGFAHSPQPNYTHDMNRME 

LLKLLLTCFSEAMYLPPAPESWQH/RTHWFSS 

FVSSENRHALPLrrSLLNTVCAYDPVEYGIPY 

NHLY 


1133 


2483 


A 


9208 


1165 


1463 


GPRARVQGFSGADIVKFMALGSMYLVLTLIV 
AKVLRGAEPCCGPLKNRVLRPCPLP/VPLPPP 
HPQPSRGNTVGCLPTYKVVYKLLSWPLHSNS 
NVYFIV 


1134 


2484 


A 


9210 


66 


1586 


MAGAGPKJ<RA1^APVAEEK£EAREKIMAAK 
RADGAAPAGEGEGVTLQGNITLLKGVAVIVV 
AJMG SGrFVTPTG VLKE AG SPGL AL W W AAC 
G VFSrVG ALCY AELGTTI SXSGGD YA YMLX) V 
YGSLPAFLKLWIELLIIRPSSQY1VALVFATYL 
LKPLFPTCPVPEEAAKLVACLCVLLLTAVNC 
YSVKAATRVQDAFAAAKLLALALIILLGFVQI 
GKGDVSNLDPNFSFEGTKLDVGNIVLALYSG 
LFA Y.GGWN YLNFVTEEMJNP Y RNLPLAI1 ISLP 
TVTI .VYVLTNl .AYFTTLSTEOMLS SEAVAVDF 

GNYHLG VM S W IIP VFVGLSCFGS VNGSLFTS S 

RLFFVGSREGHLPSILSMIHPQLLTPVPSLVFT 

CVMTLFYAFSKDIFSVINFFSFFNWLCVAJLAI1 

GM1WLRHRKPELERJPIKVNLALPVFFILACLF 

LIAVSFWKTTPWSVASDFT1ILSGLPVYFFGV 

WWKNKPKWAPPGHLSPRPSCVRSSCMVVPQ 


1135 


2485


A 


9216 


40 


410 


RDRLPPAYFCRPWCVVTALDVGXSPESQEM 
DLV AFEDV A VNFTQEE W SLLDPSQKNLYREV 
MQETLRNLASIGEKWKDQNIEDQYICNPRKNL 
RSLLGERVDENTEENHCGETS S QIPDDTLNK 


1136


2486 


A 




i 




KjCK.KK.iK I KKL^olsJ rKrurLAV oMrnAr Ivru 
DLVFAKMKGYPHWPARIDDIADGAVKPPPN 
KYPIFFFGTHETAFLGPKDLFPYDKCKDKYGK 

SEAFEANPADGSDADEDDEGVRGVMAVTAVT 
ATAASDRMESDSDSDKSSDNSGLKRKTPALK 
MSVSKRARKASSDLDQASVSPSEEENSESSSE 
SEKTSDQDFTPEKKAAVRAPRRGPLGGRKKK 
APSASDSDSKADSDGAKPEPVAMARSASSSSS 
SSSSSDSDVSVKKPPRGRKPAEKJLPKPRGRK 
PKPERPPSSSSSD 


1137 


2487 


A 


9229 


21 


239 


LFPRLECRDPVTVNCTLNLPGSKNAPTTASQV 
GSTWNYRGGLPHPTNFFVKTGFRCSQAGLKL 
RGSREPPAWA 


1138 


2488 


A 


9231 


1664 


2 


TRS VGVNTCEVG WTEPECLGPCE PGTSVNL 
EG1VWHETEEGVLWNVTWRNKTYVGTLLD 
CTKHDW APPRFCESPTSDLEMRG GRGRGKR 
ARSAAAAPQSEASFTESRGLQNKNRGGANGK 
GRRGSLNASGRRTPPNCAAEDIKASPSSTNKR 
KNKPPMELDLNS S SEDNKPGKRVRTNSRSTP 
TTPQGKPETTFLDQGCSSPVLIDCPHPNCNKK 
wuTwrr dvuhauaui npcxifi Fppprwpritr 

I ISJlllNVjLJs. I HyAiTA^LlJr t INFvljt J" trJJoxilJr^ 

ISDCEEGLSNVALECSEPSTSVSAYDQLKAPA 
SPGAGNPPGTPKGKRELMSNGPGSIIGAKAGK 

DOST AAFMPK1 FAFG1 IDKKNI GD1CEKGKK 

ANNCKTDKN\PSKLKSARP1APAPAPTPPQLIA 

IPTAIFmi-l'GTIPGLPSLTTTVVQATPKSPPL 

KP1QPKPTIMGEPITVNPALVSLKDKKKKEKR 

KLKDKEGKETGSPKMDAKLGKLEDSKGASK 

DLPOHFLKDHLNKNEGLANGLSESQESRMAS 

DCAEADKVYTFTDNAPSPSIGS 


1139 


2489 


A 


9234 


207 


443 


TRRGQPWRRRAAAAGILPGREAAACLPSC/AS 

VTAAVSGLLVGYELGIISGALLQIKTLLALSC 

HEQEMGVSSLVIGALL 


1140


2490 


A


9238 


248 


328 


MAOGNNYGOTSNG V ADESPNMLVYRKV 


1141 


2491 


A 


9242 


2 


535 


FVEAAVKMLGSLVLRRKALAPRLLLRLLRSP 

TLRGHGGASGRNVTTGSLGEPQWLRVATGG 

RPGTSPALFSGRGAATGGRQGGRFDTKCLAA 

ATWGRLPGPEETLPGQDSWNGVPSRAGLGNft 

WPWAAALWHCYSKSPSNKDAALLEAARAQ 

\NMQEVSRNRCALLHSAAVQEYGYGN 


1142 


2492 


A 


9245 


157 


466 


HLCFWFFVGLFLPEQQIMLFATLLRMAQGCD 
FALGNDFLNITTKAQA/TKEKLDKLDFIKIKTC 
CTSMDAIEKTEPLTKWTKAFVSHVSYKRLLF 
GICKEYSRQ 


1143 


2493 


A 


9247 


264 


115 


GLPQQTSTIQPPGTPDGARDFTSTIQPPGAPDG 
ARDSTSIIRMGPEIPPP 


1144 


2494 


A 


9260 


1 


401 


KKVPGRLSEMSFSLNFTLPANTTSSPVTADCGP 
SLGLAAGIPLLVATALLVALLFTLIHRRRSSIE 
AMEESDRPCEISEIDDNPKISENPRRSPTHEKN 
TMGAQEAHTYVKTVAGSEEPVHDRYRPTIEM 
ERRR 


1145 


2495 


A 


9264 


175 


411 


METIWIYQFRLIEIGDSTVGKSCLLHRFTQGRF 
PGLRSPACDPTVGVDFFSRJLLEIEPGKR1KLLL 
WDTAGQERF1SIT 


1146 


2496 


A 


9277 


592 


814 


MFTYLEGREGnCSQPKMEPHSVTVRLECSGMI 

SAHCSLNLPGTSDSPASASRWAGTTGMRHHA 

WLIFAFLVETGF 


1147


2497 


A 


9279 


1255 


2 


FRRGRRGEEEKEEEEEEEEG W VN G MENSHPP 

HHHHQQPPPQPGPSGERRNHHWRSYKLMIDP 

ALKKGHHKLYRYDGQHFSLAMSSNRPVEIVE 

DPRVVGIWTKNKEXELSVPKFXIDEFYVDQV 

PPKQ VTF AXLNDN IRENFLRDMCKK YGEVEE 

VEILYNPKTKKHLGIAKVVFATVRGAKDAVQ 

HLHSTSVMGNTIHVELDTKGETRMRFYELVLV 

TGRYTPQTLPVGELDAVSPIVNETLQLSDALK 

RLKDGGLSAGCGSGSSSVTPNSGGTPFSQDTA 

YSSCRLDTPNSYG/QGTPLTPRLGTPFSQDSSY 

SSRQFI"PSYLFSQDPAVTFKARRHESKFTDAY 

NRRHEHHYVHNSPAVTAVAGATAAFRGSSD 

LPFGTVGGTGGSSGPPFKAQPQDSATFAHTPP 

PAQATPAPGFR 


1148 


2498 


A 


9302 


1026 


g 


IASIONADTMPGVGLLVSHFSTLVSRORCPNY 

ADPQNLTDVSIFLLLEVSGDPELQPVLAGLFL 

SMCLVTVLGNLLIILAISPDSHLHTPMYFFFSN 

LSLPDV\GFTSTTVPK\MIVDI\QSRSRVISYAG 

CLTQKSLFAIFG<jTEE\NMLLSVMAYDRFYAI 

CHPLYHSAIMNPCFCAFLVLLSFFFLSLLDSQL 

HSWIVLQFTIIKNVEISNFVCDPSQLLKFACSD 

SIINSIFIYFHfCDPERQLVLAGLFLSMCLVTVL 

GNLII1LD V SPDSHLPTPMYFFLSNLS LPDIGFT 

STTVPKMIVDIQSHGRV1FYAGCLTQMSLFAIF 

GGMEERHAPECDGL 


1149 


2499 


A 


9303 


1 


699 


MASQEKDIFIGWGTIHLFRKPQRSFFGKLLRE 

FRLVAADRSMGRYMLFGVINLICTGFLLMWC 

SS-1-NSIAL'I\SY-I*YLTIFDLFSLMTCLISYWVTL 

RKPSPVYSFGFERJLEVLAVFASTVLAQLGALF 

ILKESAERFLEQPETHTGRLL VGTFVALCFNLF 

TMLSIRNKPFAYVSEAASTSWLQEHVADLSR 

SLCGIIPGLSSIFLPRMNPFVLIDLAGAFALCIT 

YMLIEI 


1150 


2500 


A 


9308 


797 


693 


DRSTSVTRAGVQWCSLGSLQPRTPGLLRSSCL 
SLP 


1151 


2501 


A 


9309 


205 


406 


vaikelpvlwkwskptrmakeppqtqqrag 
sktaappcqwsrmasegpnipcpgarhsdkq 
fucti 


1152 


2502 


A 


9314 


913 


504 


KPSPLITPPAVVLPPSAVLNEVNTFSSFPQVEV 
QGPLCGPRKGRLAVTIPFFGLS/LPKYMDHRR 
PPPHR\EIFFVFLAETGFHRASQAGPDLPTS/S/I 
PPTSA/FPKCWEYRSEPQCLPGCLSFSGILLDL 
GTNVSLRAA 


1153 


2503 


A 


9315 


392 


1 


HPHRPRPGFRSPARSSRPCPVLTSLLPPFPSPSP 
PADDLVKAGRDRKDPQVR/ERRLRPNPGRLG 
GPR\PRPARARS/CHQPRLTRVCPRSPPPEARA 
PAP AAP ARGRG APKRNRPRTDTRAPRGS S AR 
PGNS 


1154


2504 


A 


9321 


331 


433 


MPCI/QAQYGTPAPSPGPRDHSASDPLTPEFIK 
PT 


1155 


2505 


A 


9324 


180 


275 


MEEPQSDPSVEPPLSQETFSDLWKLLSENNVL 


1156 


2506 


A 


9326 


383 


619 


MISPSRTEGDPLPLPP/EGEGQEVRGFGGGPAK 

EAAQRHCRASVSILRMRRPGQGSSRPARVPL 

RGPDSHRLREPPPSPP 


1157 


2507 


A 


9327 


152 


292 


YERRGRSQGGGSHPAGAQPGGRAIGAGWQS 
KEPLWEGLQRSGSFLPG 


1158 


2508 


A 


9328 


1 


430 


QELKQGPNPLAPSPSAPSTSAGLGDCNHRVD 
LSK.TFSVSSALAMLQERRCXYWLTDSRCFL 
VCMCFLTFIQALMVSGYLSSVITTIERRYSLKS 
S ESGLL VSCFDIGNL V VVVFVS YFRGRRRRP/ 
RVAAVGGLLDLEGGEM1 


1159 


2509 


A 


9334 


108 


383 


KGNQVNGNGNQLKRKHESMCPVSLTQNTVR 
LMEAGLPQKQAERADELFEAGLVIYVKLDER 
VLN ALXYSSVGLQ WFKESDL SHLRLLEI SFR 


1160 


2510 


A 


9338 


2 


430 


FVGRPRGLSDRLEDLFLAGFRVGERLRTAAM 
KRYVRILLLGEGAEHVADPVPGGRGVPRGEA 
DHTDQELREEIHKANVERWHDVSQEATTEKI 
RTKWIPLV/RWGDHA/EGPVGIKSYLPSGRSM 
EAELPIMSQLTEIETCVEC 


1161 


2511 


A 


9341 


1 


390 


NSRVDDFVAPGLSEAGKLLGLEFPERQRLAA 
A VG/CSPMSG VI SM S APFFLGKI ID AI YTNPT V 
DYSDNLTRLCLGLSGVFLCGAAANAIRVYI-M 

GELI 


1162 


2512 


A 


9343 


84 


837 


QGRFRAFCWQRDFLQPPGMRLSALLALASKV 
TLPPHYRYGMSPPGSVADKRKNPPWIRRRPV 

WQVIRQRNWWVGGLNTHYRYIGKTMDYR 

GTMIPSEAPLLHRQVKLVDPMDRKPTEIEWR 

FTEAGERVRVSTRSGRUPKPEFPRADGIVPET 

WIDGPKDTSVEDALERTYVPCLKTLQEEVME 

AMGIKETR\NTRRSIGIEPGAEQLLPNFCPSLE 

G 


1163 


2513 


A 


9346 


967 


616 


DSLALSPRLECSGAISAHCNLTPPGFTPFSCLS 
LPSS WA YRC ASPHPDNFF VFL VESGFHHVGQ 
AGLKLLISSDPPTSA/FPKCWDYRRDSSSAPAT 
FSSYQRNNPDLILNDTIMFNTK 


1164 


2514 


A 


9347 


3 


1099 


SSFPTCMRTVFHSNTSVSSLLHRPGHVTPQLTI 
HGGWRHHRDf 1TAIDE WDFNPSKFLIY TCLLL 
FS VLLPLRLDG1 IQWS YWA VF API WLWKLL V 
VAGASVGAGVWARNPRYRTEGEACVEFKA 
MLIAVGIHLLLLMFEVLVCDRVERGTI IF WLL 
VFMPLFFVSP VS VAACVW GFRHDRSLELEILC 
S VNILQFTFIALKLDRIIH WPWL WFVPL WILM 
SFLCL WL YYTVW SLLFLRSLD V V AEQRRTH 
VTMAI S W1TIWPLLTFEVLL VHRLDGHNTFS 
YVSIFVPLWLSLLTLMATTFRRKGGNHWWF 
A1RRDF/CQDQLPQPTGKPPPPPLTDHHGEKA 
LPLQNKDRGSWPASRGSPRLL 


1165


2515 


A 


9362 


547 


991 


DVSIGPPLLRRFCSGREQTRSLSFPSDPESSFSP 

VPEGVRLADGPGHCKGRVEVKHQNQWYTV 

CQTGWSLRAAKVVCRQLRCGRAVLT\QKRC 

TKHAYGRKPIWLSQMACSGPEPTLHDCPFRP 

LGEDTLFHVEYTSVHGRERLSAKD 


1166 


2516 


A 


9363 


201 


387 


PPILRWTPPSGKNFFFFFFFESEFY/SSPRVECS 
GA1SAHLAHCNLCLPGSSDSPASAFQVAS 


1167 


2517 


A 


9368 


707 


1087 


AVLTTCLSPCSPSRIPRttSRPYPGRRSLSHTPP 
PRPULYAPAPNRPAGTAFIPHSHPPPPDLLRFT
ATPA/TPCPSLPPPPRPLHPTQPSTALLPDPPPW 
PLPFPPPSS/RPPRPDC STS YSPTFPPPT 


1168 


2518 


A 


9375 


511 


15 


MMLSEETSAVRPQKQTRFNGAKLVWMLKGS 
PITVT S A VII VLML LMM/IF S P WL ATHDPN AID 
LTARLLPPSAAHWFGTDEVGRDLFSRVLVGS 
QQSILAGLVWATTGMIGSPLECLFGELGGRA 
D AIFMR VMD I MRS/I PS L VLTMEKT AAL GPSL 
FNAMQASSEH 


1169 


2519 


A 


9377 


42 


410 


GNGRVAPRDPGAVASAEPGLTTHDSGVNPN 
NS ARRME AMAS G SN WLSG VNWLVMAYW S 
LVFVLLFIFAKRQIMRFAMKSLRGPHGPVGH 
N'APKDLKEEIDILLSRVHNIKYEP\HLLADDDA 




2520


A 

A 


one 


inn 


1 jUj 


nvQncc acvi ont) u K/f eric i ppcr DDDTnin^.B 

ILLLTICAAGIGGTFQFGYNLSIINAPTLHIQEF 

TNETWQARTGEPLPDHLVLLMWSLIVSLYPL 

GGLFGALLAGPLAITLGRKKSLLWNN1FWS 

AAILFGFSRKAGSFEMIMLGRLASWGVNAGV 

QKyTMTf^PVKyfT PHfiPSAPVPI Rr.iVAUQQA TTTT A 

LGIVMGQWGLSTTAATGLRGLVAGELEELEE 

ERAACQGCRARRPWELFQHRALRRQVTSLV 

VLGSAMELCGNDSVYAYASSVFRKAGVPEA 

tfTHV AHCVTY^PFT I TAVV3V*!! PdAI PPPAT 

WGGTPRSFALNQFTLQKKKK 


1171 


2521 


A 


9381 


2 


412 


RGPASAQEDERARTAPLERVRARGRMTTSSA 
LFPSLLPC S WSTSNK. YL AEFRAGICM S LKGTTE 
TPDK'RfCGT AY/TOOTDD9T Tl-TFPWlCriRT'sfVNV 

EDDLIIFPDDCEFKRLPQCPNGRVYVLKFKAG 
SKRLFFWMQEP 


1172 


2522 


A 


9384


20 


355 


O WNGRSTE A SPAAE APHVPHKETKAAMGTQ 
rTHG CiK VR PnPRDMT TTVVHKIKI FVT THSI 
LQLCAIM1SDYLKSSIYTVEKRLGLFRPTSGLL 
ASFNEVGNTALIVLESY 


1173 


2523 


A 


9393 


430 


87 


LCQCI VPGQQKETFSLNPSS AT VRF YL* LSLQ 
ORKFno*rn *yht mkdct hifmsajti ymkt* 
KIF VLFDFNIMFETPFYI I *FIFLFSQNLKR1RQV 
rRPPISFSKINNGP 


1174 


2524 


A 


9397 


77 


374 


ERLEIGRLGGERGSGPASCLRVIDVSGMWDQ 
RLVXLALLQLLRAFYG1KVKGVRVHRDCGTF 
ESSSTLIRVS*FGVPCNALAHFGVTHF*YILDF 
LGML 


1175 


2525 


A 


9399 


66 


397 


HESSRADRDKMDTRGSTYTOADPVNKSGGT 
AKMNXWSKGKVRDKLNNLVLFDTATYDKL 
CKEVPNYKLITLAVVSERLKJPGSLARAALIIE 
LLSRGLI*LVIQHIAQVIY 


1176 


2526 


A 


940S 


2 


299 


LDLTHVLSLSI SLTVTTLGTTFGM VIPLLD WY 
GERGYAQNGDF*DAQLDDYSFSCYSHAQVN 
GAPNSLTRA YDDP* VKJSGLECQKVGALVEV 
KCLNL 


1177 


2527 


A 


9416 


2 


402 


CNFLRSSR1RVHSTPAASTMPPKVDPNEIKVV 

YLRCTGGEVRATSALAPKIGPLGLSSIKVGVD 
FVATGDWNVLUSVILT1RILLSH1FWPPFFCF 
DHLIAFWDLQSLIFLHVIFSLFITLLLFCFFSIF 


1178 


2528 


A 


9419 


142 


426 


TPLFDLWPRWLSWLETVLTSLRTRRAASGPP 

ACRIMPTTVDDVLEHGGEVHFLQKQMLYLL 

AL[*DTFAPIYVGIVFLGFTPDHRCRSPGVAEL 


1179 


2529 


A 


9420 


1450 


1655 


LSSAGT1CMNLN*KNYWPGASAHACNPSTLG 
GQSRCITRSGDRDHPG* HGETPS VLKIQKI SRA 
WWRAP 


1180


2530 


A 


9422 


176 


375 


HRPQTTRPDWKPRT*PQGK*GRLSSEISPASPP 
SRFSRSTKPVPPKADPPARQKLTGVLHAPLLK 
L 


1181


2531


A 


y^t jo 






PTAASI RMYNf OPVTFFNT TCTAFATMVFTVP 
IARTTLDRLTGIPHGYCFVE*ADWATADKCVH 
IYNGKPLPGATPLLSLQLHQLAHLGS 


1182


2532


A 




1 

J 




VFlKPSSKSTVT ^FYPPHrMPSl STDPKPFfiriT 

SMILK*MGAGDEKISAMGKARVDHRELYLGL 

LYPTEDYKXTFRARH 


1183 


2533 


A 


9444 


384 


3 


LKDFQPWALHDWPLFCCCTFLLFLVLECFTR 
ivij^ovj w f\r w LoLyv^nr vjxvrrv w /vurijjxvr>vj v 
RDQPGQYSKTTFLPKIQKLAGHSGAHL* S*LL 
RRMRWKNR1 NPOORSrSFPRWHHPTPCrWAT 

ERG 


1184 


2534 


A 


9462 


391 


655 


LSGFKSLMPKIPLQYI YVRVRTTW SFCLPLDG 
RJCLMLS*YSK*LT*KYNILPEYSRMTLPPGMV 
IHTCNPSTLGGRAGWIV*AQEFET 


1185


2535


A 




. 

21) 




WWWGWEC W VRALKLS SGPAGPL ACWV AK 
KKSLSLSGPVYPSEKGAGLYVF*DRVSLCHPG 

WSAV Vyr W L 1 AAbN 3L.roLLioWUl IvL A 


1186 


2536 


A 


9468 


275 


452 


HIPQLHTKTHYVPTRMVNKI*QIDNSKPWQR 
GG *TGILTHC W * ESKLVQ PL WKIVWHYQ 


1187


2537 


A 


94 &y 


TOO 




bVArUroyiLrKKV i jJOUlJKj^^rSLrOr'Klji'^ 

SSRGAEPCLSNCIHSPAPRKQRMGDSDQ*STP 
NPASPHPEAPQEPWDSASGSVGSFSLGRGAK 
ASS* VPGKGRGPRQGSELLAETILELFL ALAN 


1188 


2538 


A 


9471 


124 


397 


TMDKKNRHGNSLDMASEIHNTTGPMCLIENTT 
GRLMANPEALKILSAITQPMVEEAIAGLYRAC 
*FYLrNNLAGMKKGLCLGSTEQAHTIGI 


1189 


2539 


A 


9480 


584 


769 


GHVQSQHFGRPRRADHLRSGDRDHPG* HDET 

pCT T IfTflVlQW A W WR A PVVPATWPAP APFW 
r oLLrvll^Mo W /\ W VV I\JVT V V r/i 1 W L_*M Ti/V ClC \v 

R 


1190 


2540 


A 


9483 


463 


86 


VTVGLTLLLRGAPRFTAG*PPSGGGPPLAPLL 

GASSCRRIUICNPVLAARKAGSPRSHSTRENC 
RRSRfTnTAHRRRRRGRRRNPSCVRSPRWR 


1191 


2541 


A 


9489 


1 


411 


L ADALCLS AAATGA VRPG ARAQPSTRRRLS P 
SVRVCCRAAAASNL L Y SSCLQRHSERASEEG 
ERG SLS AKCCSLVLRGGC SSSNSHSFRRIT*EI 
MAAFVLLSYEQRPLKRPRLGPPDVYPPDPKQ 
KEEELTAVNVK 


1192 


2542 


A 


9497 


389 


161 


VSFLSMSSGHCIRSTRGSKMVSWSV1AK1QEI* 

CEEDERKMARJiFLAEFMSTYVMMNlHMTVE 

KDTYSDHEEINTS 


1193 


2543 


A 


9509 


186 


1 


IAKSQ*KRWQRSGAMETLKHQWWECKLVQF 
FGKTFVNVN+ S*TYVYPCDKIILLLGL YPTEM 


1194


2544 


A 


9512 


58 


433 


PLQRSKCLTLRCLRAKPWAWSQSPRACSSAL 
LKSSRSRASSLNVQCILQSNPQGHQR1*KQKA 
SSKGQQFRR*1CEHPFMLKTLNKLRIEGT*LKI 
RRAIYDNPTAN1IVEGQKLEAFPLRTOTRQ 


1195 


2545 


A 


9515 


595 


1223 


GHGAPSFQTQVPRTP'ASWPVVPAASESAPAP 

AGGGASLPVAAGSCAAAPHTEPGAPQHLLDC 

PCPLCLARPPRRPLPDTCYGPGSGRSASLAEPP 

LPRCSCAPLRSASAPQVS*CV*AVNLLPIINL* 

PLHLLLHD*EKAWGFLFSSASHCFQGQICLLP 

APGSGPCGATARPSRGGRAGGSRARRPIPPGP 

GTRRTPSGCQNPAASGG 


1196


2546 


A 


9518 


229 


468 


RSPTATPAPHAMGPGAPFARGGRPLPLLGAM 
AERVAPGWDLHTPYLPRTN5RRTPHL**EPHA 
G YIG ALFPMS GG WPG GQ 


1197 


2547 


A 


9521 


289 


448 


1AWLSGLFFPSNQANLCFLCYKLTADSRYRG 

HAMRHLTGNTSMAIRFL*ADSRFQVQRARYE 

APNWKYKYGY*IPVDMLC 


1198 


2548 


A 


9524 


204 


1 


KNKKTTKCLSIVTLNISGPNQ«NKRHRVAEWI 
VKQEPNICHL* ETHFPFRDTYRLKEREQKKRK 
SSYS 


1199


2549 


A


9546 


1785 


1943 


GGRFKESKLTNAGWORNSFFIGPPKS1PWAA 
V* QRGDGKNPGVTHLNRPVGTX 


1200 


2550 


A 


9548 


186 


1 


VNAEKEF*KIQHYFMTKSQNKLH1EHTYLKPI 
KA1 YDKWTSDIMLNLQKL* AFFLRVI VRQI 


1201 


2551 


A 


9549 


591 


2 


SSVVEFPRGPRSSLPPLDSTFPCGSSPNWTGGC 

YHVQHLATFIMDKSEAJTS VDD AIRKL VQLSS 
KEKIWTQEMLLQVNDQSLRLLDIESQEELEDF 

PT PTVORSOTVT MOT RYPSV1 T T VPOD^FO'sTC 
PD VHFFHCDE VEAEL VHE Y ME S ALTDCRLGK 
AMRP 


1202 


2552 


A 


9552 


428 


1 


KYGNEGHWSRQCPNPGKPIRPCPLCRGPHWK. 

T nfFRPPOTlPT P^T PPT AVT'sV < 5T"YI TflT ATPri 

* WGPGMD AP ATTI ASSKTRVTLMV AGRPVFF 
LI*YRATYSALPNFSGPTQSSQVSWGIDGQV 
SKPRATPPLFCSLHTF 






A


7JOO 


S 1 7 


J JO 


EKEKRRREKGEREERKMRHRERKGESGQRD 
TMENWRVERLTEKER 


1204 


2554 


A 


9573 


83 


415 


EDKRLPX VDGDSRC AGR V* I YHDGF WGTICD 
DGWDLSDAHVVCQKLGCGVAFNATVSAHFG 

HDCRHKEDAGVICSEFTALR 


1205 


2555 


A 


9577 


64 


424 


ARGSCPTRPRTANGRMGETKDAPQMLVTFK 
DVA VTFFREEWRQL VL VHRTL YR* GMLETC 
GLLDTUUINVPQPDVVHLLYHGTQLL1V1CRE 
VSHSPCAGDMRELFTREATLTPHPYNNGA 


1206 


2556 


A 


9584 


38 


476 


TLGAVLFSEVSKESSTSHSGGQLGRQNRHPKL 
SNFTTPSSPRT KP+TASSORTJI GOTT NMFT TA V 
NPQPLSTPSWQIETKYSTKVLTGNWMEERRK 
GLPYKHLITHHQEPPHRYLISTYDDHYNRHG 
YNPGLPPLRTWNGQKLLWL 


1207 


2557 


A 


9586 


2 


412 


L RS SP AAL LRAL C rTTVTGT ALAL RSRV ATTN 
PDGCRNVLRPKYYRLCX)KA£SWGIALETVPT 
GVAVTSWAIMLTVLTLVCKGQDYNRRQKLP 
TH1LCLL*EKGIFGLTFAF11GLDGSTGPTRFFL 
FGJXFSICFS 


1208 


2558 


A 


9597 


122 


3 


KNYWPGMVAHACNPSPLGGRGRW1A*AQK 
F ADA WAD AW 


1209 


2559 


A 


9611 


148 


558 


KSLRN V WDLLNNTWK ADRFFCHSS RTSTTRK 
GDPGPTFSKMSIWTSGRTSSSYRHDEKRNIYQ 
RIRDHDLLDKRKTVTALKAGEDRAILLGLAM 
MVCSIMM* FLLGITLLRS YMQ S VWTRESQCT 
LLNAS1TETFNC 


1210 


2560 


A 


9618 


384 


2 


SLHDMLMLAEQQQKQKWAVNTQNTAWSNA 
DSKFGQRILEKMEW SKGRGLG VQEQGGPDDI 
KVQVKN>TOLGLQATINNEANWIAHQDDFNW 
LLAELNTCQRQETADS***WSPKNSHVGKDS 
GELSAK 


1211 


2561 


A 


9620 


316 


610 


QKHPGGGQLGRSPQEDSRFHNKASSGVSRVR 
LGRAW WLTP VIFTL WEAKAGG SPE* D* AGRG 
GSJRL* SQHFGRPRRVDHLRS A VQDQPGQHGE 
TFSLLKIQKIN*VWGRRL*SSYSEAEAGESL 


1212 


2562 


A 


9623 


297 


344 


QFPVDGDYQKIEKITQLFQAQNLSLCLAMTR 
TREL* K.GGGKGRHE* AWPFLKKGGYGVKAP 
AJLNTSNCT* CF*ETKMLSDDPKACVFE VSSA 
DL*NTSFGVIR 


1213 


2563 


A 


9624 


2 


356 


AELSLASTACGRNTSODSLPDYDRAPISSPLA 
TSGT1LSAISCLWDLPTPVLRVGLSCQPSMSSQ 
IPRMYSTDVEAAVNSLEDLYLQAYYAYLCVG 
L YFHRDDMALEGVSRFL* ELAE 


1214 


2564 


A 


9634 


776 


912 


SLSRWVRAKL*VPYNQENCLNPRGGGCSEPR 
SHYCTPAWATEKDS 


1215 


2565 


A 


9636 


220 


426 


KFGNFA VSSEY* DITSGQLKTA VRG * IEMTST 
EENFGEKLHDIGFGNGFLDKT* KAQATKAKI 
DK 


1216 


2566 


A 


9637 


391 


76 


CFL EDGCTQA S * AEEAA VSPSMAEEEQGSTS C 
RE R RS IRFKM KNHS PDDTI KENVTl SN IRTRKI 
NHLPETERNLLEHGLMYIRLNAAFCSLVAHS 
LFGFILKAT 


1217 


2567 


A 


9655 


2008 


2432 


LHCKMGALETQTHPCSQNMLRSLQKCCCKV 

EEHHLQPVQVLQTLLHSATAGTGCRRPARPP 

PAPPTPTPWRSRQSGKQSERAS»LKGRGRYGL 

GALGGRGORALGGSRWPPPLPGETLFSGCKH 

RRRRRGSDAAPGEEAGT 


1218 


2568 


A 


9658 


3 


405 


HASARALLSPNLSPNNKMA1SGGPVLGFFIIA 
VLMSAQEPWATKEEHV1IQAEFYLNPDQSGEF 
MLDFE GEDTFHGDMAKKET V WRLE *L ARLD 
NFEAQRALANIAADQAALE1MDMGSDYTL1P 
NVPPKVTVL 


1219 


2569 


A 


9662 


3 


284 


PDWTEKRKMQDTG SILPLH WFGFGY AAL V A 
YGGI1GYVKAGSVPSLAAGLLFGSLSGLGAYQ 
LSQDPRNVWVFLATSGTLAGIMGMRFYHSG 
KL 


1220 


2570 


A 


9669 


200 


699 


LLLTGYIQTLQNQQLSGNQQEMQAVDNLTSA 

PGNTSI .CTRDYKITQVLFPLLYTVLEFVGLITN 

G1.AMRJFFQ1RSKSNF1IFLKNTVISDLLMILTF 

PFKrLSDAKLGTGPLRTWCQVTSVTJFYFTMYI 

SISFLGLITTDRYQKTTRPFXTSNPKNLLGAKIL 

K 


1221 


2571 


A 


9676 


164 


562 


KFRDSSrFSAAMTTMQGMEQAMPGAGPGVP 
QLGNMAVIHSHLWKGLQEKFLKGEPKVLGV 
VQILTALMSLSMGrrMMCMASKTYGSNPISV 
YTGYTIWGSVMFUSGSLSIAAGIRTTKGLVRG 
SLGMNITSS 


1222 


2572 


A 


9688 


43 


412 


V AKMVKCC S AI GC ASRCLPNSKLK GLTFHVF 
PTDENIKRK\VVLAMKRLDVNAAG1\VHPKKG 
DVLCSRHFKKTDFDRSAPNIKLKPGVIPSIFDS 
PYHLQGKREKLHCRKNFTLKTVPATNYNH 


1223 


2573 


A 


9696 


308 


564 


RTSMGILYSEPICQAAYQNDFGQVWRWVKE 

DSSYANVQDGFNGDTPLICACRRGHVRTVSFL 

LKKECLCQPQKPERENLLALCCE 


1224 


2574 


A 


9700 


3 


632 


DAWASGGELGSLFDHHVQRAVCDTRAKYRE 

GRRPRAVKVYTINEESQYLUQGVPAVGVMK 

ELVERFALYGAIEQYNALDEYPAJEDFTEVYLI 

KFMNLQSARTAKRKMDEQSFFGGLLHVCYA 

PEFETVEETPvKXLQMRKAYVVKITENKDHY 

VTKKKL VTEHKDTEDFRQDFHSEMSGFCKA 

ALNTSAGNSNPYLPYSCELPLCYFSSK 


1225 


2575 


A 


9710 


1 


163 


RSGCVLRMTEWETGAPAVAETPDIKLFGKWS 
TDD VHINDI SLQD YI AG VRLILL 


1226 


2576 


A 


9713 


82 


492 


QGLPSFLPAFGPSGSWLGPAPTLGSSCNTVDT 

ICHGYSEIRPLFYLSFCDLLLGLCWLTETLLYG 

ASVANKDIICYNLQAVGQIFYISSFLYTVNYI 

WYLYTELRMKHTQSGQSTSPLVIDYTCRVCQ 

MAFVFSSL1 


1227 


2577 


A 


9720 


3 


416 


GKWKRTQVPLLGEECADMDLARKEFLRGNG 

LAAGKMNISIDLDTNYAELVLNVGRVTLGEN 

NRKKMKDCQLRKQQNENVSRAVCALLNSGG 

GV1KAEVENKGYSYKKDGIGLDLENSFSNML 

PFVPNFLDFMQNGNYF 


1228 


2578 


A 


9723 


278 


411


EAS S SNTVA SN VADKTDPHSMNSRVFIGNLN 
TLVLQK.SDVEAVF 


1229 


2579 


A 


9725 


121 


902 


LFAMSGFENLNTDFYQTSYSIDDQSQQSYDY 

GGSGGPYSKQYAGYDYSQQGRFVPPDMMQP 

QQPYTGQIYQPTQAYTPASPQPFYGNNFEDEP 

PLLEELGrNFDHIWQKTLTVLHPLKVADGSIM 

NETDLAGPMVFCLAFGATLLLAGKIQFGYVY 

GISAIGCLGMFCLLKLMSMTGVSFGCVASVL 

GYCLLPMlLLSSFAVIFSLQGMVGIILTAGnG 

WCSFSASK1FISALAMEGQQLLVAYPCALLYG 

VFALISVF 


1230 


2580 


A 


9739 


11 


247 


TFVLNN4NTPKEEFQDWPIVRJAAHLPDLIVYG 
HFSPERPFMDYFDGVT -MFVDISGKCKRDVCI. 
MWMSNRLAWEFTCRA 


1231 


2581 


A 


9744 


37 


1100 


TPLFDFWPGFVLSWLQPLSASLRARRAASGPP 

ACR1MPTTVDDVLEHGGEFHFFQKQMFFLLA 

LLSATFAPIYVGIVFLGFTPDHRCRSPGVAELS 

LRCGWSPAEELNYTVPGPGPAGEASPRQCRR 

YEVDWNQSTFDCVDPLASLDTNRSRLPLGPC 

RDG W VYETPGSSIVTEFNLVCAN S WMLDLFQ 

S S VNVGFFIG SMSIG YLADRFGRKLCLLTTVLI 

N AAAG VLMAI SPTYT WMLIFRLIQGLVSKAG 

WL1GYTLITEFVGRRYRRTVGIFYQVAYTVGL 

LVLAGVAYALPHWRWLQFTVALPNFFFLLY 

YWCTPESPRWLISQNKNAEAMRIIKHIAKKNG 

JCSLPASL 


1232 


2582 


A 


9753 


164 


517 


PGPGMQGPPPITPTSWSLPPWRAYVAAAVLC 
YrNn^LNYMNWFriAGVLLDIQEVFQISDNHAG 
LLQTVFVSCLLLSAPVFGYLGDRHSRKATMS 
FGILLWSGAGLSSSFISPRYSWLF 


1233 


2583 


A 


9757 


25 


419 


LPAPWTERVRKSEGLVGTCLGDPMASPRTVT 
IVALSVALGLFFVFMGTIKLTPRLSKDAYSEM 
KRAYKSYVRALPLLKKMGrNSILLRKSIGALE 
VACGrVMTLVPGRPFCDVANFFLLLLVLAVLF 
FHQLV 


1234 


2584 


A 


9765 


71 


456 


RLELDWGFSLHFLPVAYLCPLSSGFEMNVQP 

CSRCGYGVYPAEKISCIDQIWHKACFHCEVC 

KMMLSV^T^FVSHQKKPYCHAHNPKNNTFTS 

VYHTPLNLNVRTFPEAISGIHDQEDGEQCKSV 

FHWD 


1235 


2585 


A 


9767 


52 


559 


IRJSGAMSVDKAELCGSLLTWLQTFHVPSPCA 

SPQDLSSGLAVAYVLNQIDPSWFNEAWLQGI 

SEDPGPNWKLKVTSGLL1RGQTGEEMTRDGP 

ARHMSWVMGRKRDRCLVINHLFrHSSMEYSP 

C ARPG H S ARKNTOKNLPHTA riL VT SNTYTTI 

KiNFQAGRSGSCL 


1236 


2586 


A 


9770 


352 


608 


FRGE ALTVRFLTKRFIGEYASNFES I YKKHLC 
LERKQLNLEIYDPCSQTQKAKFSLTSELHWA
DGFV1VYDISDRSSFAFAKAL1 


1237 


2587 


A 


9793 


266 


515 


NILAHYFPFPRLFLLRDSQSNPKAFALTLCHH 
QKIKNFQ ELP V S I D ALTPPL W CFL V S FLTHFS 
RYKPTRPVCITQFQGCS 


1238 


2588 


A 


9802 


537 


967 


ELGAGRSDREAMEAAVKEEISVEDEAVDKNI 

FRDCNKIAFYRRQKQV/XSKKSTYRALLDSVT 

TDEDSTRFQIINEASKVPLLAEIYGIEGNIFRLK. 

rNEETPLKPRFEVPDVLTSKPSTVRLISCSGDT 

GSLILAIXjICGDLKC 


1239 


2589 


A 


9805 


105 


540 


VPGDPAMVRAGAVGAHLPASGLDIFGDLKK 
MNKRQLYYQVLNFAMIVSSALMIWKGLIVLT 
G9F1PTVWT SG^MFPAFHRGDT I FT TNTFRFr* 
P1RAGEIWFKVEGRDIPIVHRVIKVHEKDNG 
DIKFLTKGDNNEGUDRGSYK. 


1240 


2590 


A 


9819 


3 


305 


TDGRDPLPCAARRRGGGGECCGAGWVAEWS 
PQPLDPAMLLWMQGFVLEAVACQDNDDYLR 
YGELFEDLDCNG DG WDI IELQEGLRN WS S AF 
DPNSEEHG 






A 

A 


9834 


841 




LDSSTHSSSTATQSRAKMNTPAPTPSTVPAIPR 
GGSGGPPPCAPHDRVSSVLQCDTQAMDHKTE 

t;<5H<!VVPPl FVRTVTP9PFT-TPA VRFTsTR'NI 


1242 


2592 


A 


9843 


3 


589 


TISCGPATEPPASLLSSASSDDFCKEK.TEDRYS 
LGSSLDSGMRTPLCRICFQGPEQGELLSPCRC 
DGS VKCTHQPCLIKW ISERGC W SCELC YYKY 
HVIAISTKNPLQWQAISLTVIEKVQVAAAILGS 
LFL1A SIS WLI WSTFSPS ARWQRQDLLFQIC YG 

[VI I \_J i V IVll V f\ V l^/OdWlVl V V^/A^-\JVJj V vJIVIV YY O 

DIPP 


1243 


2593 


A 


9846 


198 


411 


WRISHHAGKMPVMKGLLAPQNTFLDT1ATRF 
DGTHSNFILANAQVAKGFPIVYCSDGFCELAG 
FARTEVMQ 


1244 


2594 


A 


9848 


116 


650 


PICGFL YLC S AM ASESSPLLA YRLLGEEGV AL 

PANGAGGPGGASARKLSTFLGWVPTVLSMF 

SIWFLRiGFWGHAGLLQALAMLLVAYFILA 

LTVLSVCAIATNGAVQGGGAYCILQHRWTG 

VWPVLPAREVMISRTLGPEVGGSIGLMFYLA 

NVCGCA VS LLGLVE S VLDVFG A 


1245 


2595 


A 


9849 


573 


1620 


KSKCRFPEGLSEGFGPMRKEAL SSG S VQE AE 

SYIGPKRTAWRGJMHREAFNUGRRJVQVAQ 

AMSLTEDVLAAALADHLPEDKWSAEKRRPL 

KSSLGYEITFSLLNPDPKSHDVYWDIEGAVRR 

YVQPFLNALGAAGNFSVDSQILYYAMLGVNP 

RFDSASSSYYLDMHSLPHVrNPVESRLGSSAA 

SL YP VLNFLL YVPEL AHSPL YTQDKDG APVAT 

NAFHSPRWGGIMVYNVDSKTYNASVLPVRV 

EVDMVRVMEVFLAQLRLLFGIAQPQLPPKCL 

LSGPTSEGLMTWELDRLLWARSVENLATATT 

TLTSLA 


1246 


2596 


A 


9850 


114 


464 


PPQLGAQRVREPRHPDVRAPLRVTSPGLRSRS 
ARSLGRRPRIAMVTVGNYCEAEGPVGPAWM 
QDGLSPCFFFTLVPSTRMALGTLALVLALPCK 
RRERPAGADSLSWGAGPRISSYV 


1247 


2597 


A 


9851 


2 


327 


FVRNKKMTRSCSAVGCSTRDTVLSRERGLSF 
HQFPTDTIQR SK W IRA VNR VDPRSKKI WIPGP 
GAILCSKHFQESDFESYGIRRKLKKGAVPSVS 
LYKVFKYSSRCTS 


1248


2598


A 


9853 


58 


444 


R VDDF VYSK.GGKD AGG AD VSL ACRRQ S IPEE 

FRG1TVVELIKKEGSTLGLTISGGTDKDGKPR 

VSNLRPGGLAARSDLLN1GDYIRSVNGIHLTR 

LRHDEITTLLKNVGERVVLEVEYELPPPGGCP 

WT 


1249 


2599 


A 


9856 


2 


^265 


LPPPRPSRHRRGRAGTRASAAAAAGPTVSAV 

RAPVRGQDSGAGTPQGRLAGRGAHLSRVGA 

SGSGVAAGPAARHAPRRRCADAGEAVOASC 

GRCA VALLSG VCTL VSTHVCVG S GCPG AAGT 

PMGAGDAGASAESAVTTAPQEPPARPLQAGS 

G AGPAPGRAMRSTTLLALLAL VLLYL VSGAL 

VFRALEQPHEQQAQREL'GEVREKFLRAHPCV 

SDQELGLLIKEVADALGGGADPETNSTSNSSH 

SAWDLGSAFFFSGTIITTIGGGGDWHVGGGK 

ELPHGGRCRETEGSQVAPRLPASPLCPGYGN 

VALRTDAGRLFCIFYALVGIPLFGn XAGVGD 

RLGSSLRHGIGHIEAIFLKWHVPPELVRVLSA 

MLFLLIGCLLFVLTPTFVFCYMEDWSKLEAIY 

FVrVTLTTVGFGDYVA 


1250 


2600 


A 


9873 


2 


652 


FVVPSPCGG1PGRAPNGASRPTMGNSASRNDF 
EWVYTDQPHTQRRKEELAKYPAIKALMRFDP 
RLKWAVLVLVLVQMLACWLVRGLAWRWLL 
F W A Y AFGGC VNHSLTL AIHDI SHN AAFGTGR 
AARNRWLAVFANLPEGVPYAASFKKYHVDH 
HR YLGGDGLD VD VPTRLEG WFFCTP A RKLL 
WLVLQPFFYSLRPLCVHPKAVTRMEVLNTLV 
QLA 






A




i -sn 




LESYRPDTDLSREDTGCNLQH1SDRENIDDLN 

MEFNPSDHPRASTIFL SKSQTD VREKRK SLFIN 

HHPPGQIARKYSSCSTIFLDDSTVSQPNLKYTI 

KCVALAIYYHIKNRDPDGRMLLDIFDENLHPL 

SKSEVPPDYDKHNPEQKQIYRFVRTLFSAAQL 

TAECAIVTLVYLERLLTYAEIDICPANWKRIV 

LGAILLASKVWDDQAVWNVDYCQILKD1TVE 

DMNELERQFLELLQI?NTNVPSSVYAKYYFDL 

RSLAEANNLSFPLEPLSRERAHKLEAISRLCED 

KYKDLRRSARKRSASADNLTLPRWSPAIIS 


1252 


2602 


A 


9879 


6 


376 


KRPDSRPPAQYRAGPTRPRTRGCELLYWKAT 
KAVGDCMGSLSTANVEFCLDVFKELNSNNIG 
DNIFFSSLSLLYALSMVLLGARGETEEQLEKV 
WNSSEVCSEPRSLSCSRSGSAKLILSLYQ 


1253 


2603 


A 


9880 


180 


388 


KEQAELLYGLYCQCDLTLSSHPSSVPAMSSC 

NFTHATFVLIGIPGLEKAHFWVGFPM-SMYVA 

AMFGNC 


1254 


2604 


A 


9881 


19 


494 


ViSFQIITDTIMDSSTAHSPVFLVFPPEITASEYE 

STELSATTFSTQSPLQKJLFARKMKILGTIQrLF 

GIMTFSFGVIFLFTLLKPYPRFPFIFLSGYPFWG 

SVLFINSGAFLiAVKRKTTETLIILSRIMNFLSA 

LGAIAGHLLTFEFHPRSKLHL 


1255 


2605 


A 


9896 


72 


386 


RPGREQRDCFQAPPLGLGGRQTDMMHHPLT 
GATCVGLPNVGMCPQLSGALTFMYLQQGNQ 
EATVAPDTMAQPYASAQFAPPQNGIPGEYTA 
PHPHPAPEYTGQTT 


1256 


2606 


A 


9902 


95 


399 


SGGPAGLLHRPVLPKMGLSGLLPILVPFILLG 
DIQEPGHAEGILGKPCPKIKVECEVEEIDQCTK 
PRDCPENMKCCPF SRGKKCLDFRK VSLTL YH 
KEELE


1257 


2607 


A 


9905 


374 


459 


EHLKSTPNRLGVVAHTCNPSTLGGRGGW


1258


2608 


A 


9911 


364 


1974 


AGPG VP A VGG RW ASGPGLGGRTLC SGPPDH 

QRRGPSCGASGDPQCVGSPHPQRARPLLARP 

GARLLPGHLPSPRPPRLPTGQPPAAAFRGPVR 

PQGGGHIHPLPTPGGRPCFAVSEGSGSALLLS 

YLGFCGSSSYVTGAACISPVLRCRJEWFEAGLP 

WPYERGFLLHQKIALSRYATALEDTVDTSRL 

FRSRSLREFEEALFCHTKSFPISWDAYWDRND 

PLRDVDEAAVPVLCICSADDPVCGPPDHTLTT 

ELFHSNPYFFLLLSRHGOHCGFLRQEPLPAWS 

HEVILESFRALTEFFRTEERIKGLSRHRASFLG 

GRRRGGALQRKEVSS S SNLEEIFNWKRS YTRL 

MAAAAGAAAAPGSREPQDRPECGAGHPGPR 

Y YRHPERWLLRPE AFLGPLRTRAPS AED SQR 

ERPAARSGPEMRVRYPWAAVLAPYLALSQD 

PMVKSSASGQGASGSYNHVREEMLIKAGGA 

MSRRWRQSKFRHVFGQAAKADQAYEDIRV 

SKVTWDSSFCAVNPKFLAIIVEAGGGGAFIVL 

PLAK 


1259 


2609 


A 


9919 


693 


935 


GCFKFIGESTCCWIFPSSVTTQCVVAKAPRAA 
TLSKAERLRSQPGPEQGGSSYRPRTPTAAA1L 
PPRPGRSHRKRKLVSTK. 


1260 


2610 


A 


9921 


455 


1082 


QRSCLCSAIEKDGGDVKALYRRSQALEKLGR 

LDQAVLDLQRCVSLEPKNKVFQEALRNIGGQ 

IQEKVRYM S STD AKVEQMFQILLDPEEKGTE 

KKQKASQNLVVLAREDAGAEKIFRSNGVQLL 

QRLLDMGETDLMLAALRTLVGICSEHQSRTV 

ATLSILGTRRWSILGVESQAVSLAACHLLQV 

MFDALKEGVKKGFRGKEGAirV 


1261 


2611 


A 


9928 


1 


438 


GFRGAEAPGAAQAPKKKKPRPTEGGPGAGSG 

RGKDPYRGPTLLHQPKPPKDEFLSSLESYEIAF 

PTRVDHNGALLAFSPPPPQRQRRGTGATAES 

RLFYKEASPSTHFLLNLTRSSRLLAGHVSVEY 

WTREGLAWQRADRPHCLYA 


1262 


2612 


A 


9931 


168 


435 


AAEMGRAGAAAVIPGLALEWAVGLGGPPPA 
PPRLPFCLQELQGRHALHTFSLERTCSYQDFL 
WADEGRLLHVGAQDLATWHTLSPLGLW 


1263 


2613 


A 


9938 


247 


488 


RMSATSVDQRPKGQGNKVSVQNGS1HQKDG 
CNDDDFEPYLRSPDNQSNSYPPMSDPYMPGY 
YAPSIGFPYSLGEAAWSQL 


1264 


2614 


A 


9941 


61 


277 


ESIGLTALGPRRRPWEHRWSDPrTLKMKGWG 
WLALLLGALLGTAWARRSQDLHCGACKAVR 
RRVRQFNIYDY 


1265 


2615 


A 


9956 


2 


522 


FVASEVSICMPVPASWPHPPGPFLLLTLLLGLT 

EVAGEEELQMIQPEKULLVTVGKTATLHCTV 

TSLLPVGPVLWFRGVGPGRELIYNQKEGHFP 

RVTTVSDLTKRNNMDFSIRISSrrPADVGTYY 

CVKFRKGSPDHVEFKSGAGTELSVRGEYSVG 

FLSQVWWWLSSHPFMN 


1266 


2616 


A 


10002 


243 


387 


PKNN ACHLLFTAVCQPRCKHGEC IGPNKCKC 
HPGYAGKTCNQGRKTV 


1267 


2617 


A 


10004 


36 


707 


LPAPASTWS VARETMAS SS VPP ATVS AATAG 

PGPGFGFASKTKKKHFVQQKVKVFRAADPLV 

GVFLWGVAHSTNELSQVPPPVMLLJ'DDFKAS 

SKIKVNNHLFHRENLPSHFKFKEYCPQVFRNL 

RDRFGDDDQDYLVSLTRNPPSESEGSDGRFLIS 

YDRTLVIKEVSSEDIADMHSNLSNYHQVRPLS 

SP1LSLSSLLTYSSAIVSNRCQLGRKLIGRENP 


1268 


2618 


A 


10005 


2 


209


GEGYELFVPSNGVPAVCHMVGRRPHRAVLSP 
SQDELEHSLGES AAQG AAGWLWVS WENTR 
TKVSLGLA


1269 


2619 


A 


10010 


245 


688 


FGMLKNKGHSSKKDNLAVNAVALQDHILHD 
LQLRNl^VADHSKTQVQKKEl^SLKRDTKAJ 
IDTGLKKTTQCPKLEDSEKEYVLDPKPPPLTL 
AQKLGLIGPPPPPLSSDEWEKVKQRSLLQGDS 
VQPCPICKEEFELRPQVFSIRG 


1270 


2620 


A 


10011


2


588 


RVDDFVRPLPPGLMSRSRASIHRGSIPAMSYA 
PFRDVRGPSTHRTQYVHSPYDRPGWNPRFCII 
SGNQLLMLDEDEIHPLLIRDRRSESSRNFCLLR 
RTVSVPVEGRPHG EHE YHLGRSRRKS VPGGK 
QYSMEGAPAAPFRPSQGFLSRRLKSSIKRTKS 
QPKLDRTSSFKQILPRFRSADHDRYRGWSMW 
DEIDV 


1271 


2621 


A 


10013 


209 


363 


LPAPPNLSPRLSFGFQFPGGNDNYLTTTGPSHP 
FLSGAEVSQSCRRRGGRA 


1272 


2622 


A 


10014 


7 


388 


SAVTISWKWRSVMGIQTSPALLASLGAGLVT 
LLGLAVGSYLVRRSRRPQVTLLDPNEKDLLR 
L IDKTLS ARSPCKHI YLSTRIDGSLSIRPYTPVT 
SDEDQGYVD1DIKVYLKGVHPTFPEGGKMSH 


1273 


2623 


A 


10016 


1 


1339 


MAARTLGRGVGRLLGSLRGLSGQPARPPCGV 
S APRRAASGPSGSAP AV AAAAAQPGSYP ALS 
A Q AARE P AAF WG PL ARDTL VWDTP YHTV W 
DCDFSTGKIGWFLGGQLNVSVNCLDX^HVRKS 

PPWAl lU/FD PIP PTiTFVTi TTVR FT T FTTrR I A 

NTT JGiHG VHRGDRVAI YMP VSPLA VAAML A 

CARIGAVHTVIFAGFSAESLAGRINDAKCKVV 

rTFNQGLRGGRVVELKKIVDEAVKHCPTVQH 

VLVAHRTDNKVHMGDLDVPLEQEMAKEDP 

VCAPESMGSEDMLFMLYTSGSTGMPKGIVHT 

QAGYLLYAALTHKLVFDHQPGDIFGCVADIG 

WITGHSYVVYGPLCNGATSVLFESTPVYFNA 

GRYWETVERLKINQFYGAPTAVRLLLKYGD 

AWVKKYDRSSLRTLGSVGEPINCEAWEWLH 

RVVGDSRCTLVDTWWQT 


1274 


2624 


A 


10017 


1 


3750 


FRPQGTPRSPASHVLTMSAPDEGRRDPPKPKG 

KTLGSFFGSLPGFSSARNLVANAHSSARARPA 

ADPTGAPAAEAAQPQAQVAAHPEQTAPWTE 

KELQPSEKMVSGAKDLVCSKMSRAKDAVSS 

GVASVVDVAKGVVQGGLDTTRSALTGTKEV 

VS SG VTGAMDMAKGA VQG GLDTSKA VLTG 

TKDTVSTGLTGAVKVAKGTVQAGVDTTKTV 

LTGTKDTVTTGVMGAVNLAKGTVQrGVETS 

KAVLTGTKDAVSTGLTGAVNVARGSIQTGV 

DrSKTVLTGTKDTVCSGVTGAMNVAKGTIQT 

GVDTSKTVLTGTKDTVCSGVTGAMNVAKGT 

IQTGVDTSKTVLTGTKDTVCSGVTGAMNVA 

KGT1QTGVDTTKTVLTGTKNTVCSGVTGAVN 

LAKEAIQGGLDTTKSMVMGTKDTMSTGLTG 

AANVAKGAMOTGT TsnTONTATfiTKDTVr'SG 

VTGAMNLARGTIQTGVDTTKIVL TGTKDTVC 

SGVTGAANVAKGAVQGGLDTTKSVLTGTKD 

AVSTGLTGAVNVAKGTVQTGVDTTKTVLTG 

TKDTVCSGVTSAVNVAKGAVQGGLDTTKSV 

VIGTKDTMSTGLTGAANVAKGAVQTGVDTA 

KTVLTGTKJD IVTIGL VG AVN VAKGTVQTGM 

DTTKTVLTGTKDTIYSGVTSAVNVAKGAVQT 

GLKTTQNIATGTKNTFGSGVTSAVNVAKGAA 

QTGVDTAKTVLTGTKDTVTTGLMGAVNVAK 

GTVQTSVDTTKTVLTGTKDTVCSGVTGAAN 
VAKGAIQGGLDTTKSVLTGTTCDAVSTGLTGA

VKLAKGTVQTGMDTTKTVLTGTKDA VCSGV 

TGAANVAKGAVQMGVDTAKTVLTGTKDTV 

CSGVTGAANVAKGAVQTGLKTTQNIATGTK 

NTLGSGVTOAAXVAKGAVQGGLDTTKSVLT 

QTKDAVSTGLTGAVNLAKGTVQTGVDTSKT 

VLTGTKDTVCSGVTGAVNVAKGTVQTGVDT 

AXTVLSGAKDAVTTGVTGAVNVAKGTVQTG 

VDASKAVLMGTKDTVFSGVTGAMSMAKGA 

VOGGLDTTKTVLTGTKD AVSAGLMG SGNVA 

TGATHTGLSTFQNWLPSTPATSWGGLTSSRT 

TDNGGEQTALSPQEAPFSGISTPPDVLSVGPEP 

AWEAAATTKGLATDVATFTQGAAPGREDTG 

LLA1THGPEEAPRLAMLQNELEGLGDIFHPM 

NAEEQAQLAASQPGPKVLSAEQGSYFVRLGD 

LGPSFRQRAFEHAVSHLQHGQFQARDTLAQL 

QDCFRL 


1275 


2625 


A 


10025 


124 


415 


TILARKKEKTCPCKKEIGRNSRSGMYSRXAM 
YKRKYS AANTK VEKKKXEKV LAPVTKP VG G 
DKNGGTRVVKLPTMPRYYPTEDVPRKLLSHG 
KKPFS 


1276 


2626 


A 


10030 


3 


507 


GGSLRFSPPRVPSCSRVFCPVPPGGCGLPSPMS 

ASRPQSPTTPWCLPRRYMKHKRDDGPEKQED 

EA VDVTPVMTCVFVVMCC SML VLL Y YF YDL 

LVYWIGIFCLASATGLYSCLAPCVRRiLPFGK 

CRIP^NSLPYFHKRPQARMLLLALFCVAVSV 

VWGVFRNEDQ 


1277 


2627 


A 


10035 


51 


869 


YSRFTVPLPATMASSEVARHLLFQSHMATKT 

TCMSSQGSDDEQIKRENIRSLTMSGHVGFESL 

PDQLVNRSIQQGFCFNILCVGETGIGKLSTLIDT 

LFNTNFEDYESSHFCPNVKLKAQTYELQESN 

VQLKLTIVNTVGFGDQrNKEERQLGRSQSTEN 

PQKYRSEQI IPVEPKXCTSFWKGALGKWAGIE 

S SGQ S AQQPYLPINSPPHRL ADV AD VHLFSS V 

LSGAFGCYHLDVTVNEFKKQQNRDEQEGYS 

KGDQEQGSWfCHGADPLRGGEM 


1278 


2628 


A 


10036 


3 


457 


RAFDVRRKKSLRPCCPRDFHAGCLTVSGPST 
VMGAVGESLSVQCRYEEKYKTFNKYWCRQP 
CLPIWHEMVETGGSEGVVTlSDQVirTDHPGDL 
TFTVTLENLTADDAGKYRCGIATILQEDGLSG 
FLPDPFFQVQVLV SSASSTENS VKTP 


1279 


2629 


A 


10039 


214 


435 


NDSLVPMSSWRSCARAPSSESAWRRSAATRR 
SRKCLRTKRKRWS SGKGTQMQSTL SETPRRA 
QMPCMWWYPFWG 


1280 


2630 


A 


10043 


2 


344 


RATWHNAGKEREAVQLMAGAEKRVXASHS 
FLRGLFGGNTRIEEACEMYTRAANMFKMAK 
NWSAAGNAFCQAAKLHMQLQSKHDSATSFV 
DAGNAYKKADPQGFCTARHVACYLCV 


1281 


2631 


A 


10080 


620 


818 


VIYKLDSSLFSYF1YFFIFETESHFLPLMKWTG 
PIMAHCSLKILASRNSADSAFLSAGDTSLSHST 


1282 


2632 


A 


10084 


3 


1640 


SAS1I1RGDKRASGEVGIAPSSRH1I.IGEPSAKY 

NGTAIISLVRGPGILGEVTVFWRTFPPSVGEFA 

ETSGKLTMRDEQSAVIVVIQALNDDIPEEKSF 

YEFQLTAVSEGGVLSESSSTANITVVASDSPY 

GRF AFSHEQLR VS EAQRVNI71IRS SGDFGHVR 

LWYKTMSGTAEAGLDFVP.AAGELLFEAGEM 

RKSLHVE1LDDDYPEGPEEFSLTITKVELQGR 

GYDFT1QENGLQ1DQPPEIGN1S1VR1IIMKNDN 

AEGHEFDPKYTAFEVEEDVGLIMIPWRLHGT 
Y G YVTADFI SQS S S ASPGG VD YILHGSTVTFQ

HGQNLSFINIS1IDDNESEFEEFEILLTGATGG 

A VLGRHL VSR1 IIAKSDSPFG VIRFLNQSKI SIA 

NPNSTMILSLVLERTGGLLGEIQVNWETVGPN 

SQEALLPQNRDIADPVSGLFYFGEGEGGVRTII 

LTTYPHEEIEVEETFIIKLHLVKGEAKLDSRAK 

DVTLTIQEFGDPNGWQFAPETLSKKTYSEPL 

ALEGPLLITFFVRRVKGTFGE1M 


1283 


2633 


A 


10088 


316 


516 


MGSKTLPAPVPIHPSLQLTNYSFLQAVNCLPT 
VPSDHLPNLYGFSALHAVHLHQ^TLGYPAM 
HLXRS 


1284 


2634 


A 


10091 


2 


569 


WSPSRAMASALIYVSKFKSFVIL\ r VTPLLLLP 
LVILMP AKF VRC AYVI ILMAJ Y WCTE VIPLA V 
TSLMP VLLFPLFQTLD SRQVCVQYMKDTNML 
FLGGLIVAVAVERWNLHKRIALRTLLWVGA 
KP ARL MLGFMG VTA I . L SM WT S NT A TT AMMV 
PIVEAILQQMEATSAATEAGLELVDKGKAKE 
LP 


1285 


2635 


A 


10092 


290 


728 


KQSTRPDVMTLYPLHWQEEMSGESVVSSAVP 
AAATRTTSFKGTSP SSK YVKLNVGG AL YYTT 
MQrLTKQDTMLKAMFSGRMEVLTDSEGWrL 
LDRCGKHFGTTLNYLRDGAVPLPESRREIEELL 
AEAKYYLVQGLVEECQAALQV 


1286 


2636 


A 


10100 


1 


574 


RPRGRGAWAGPGQDYSGVRRQQRRRTRISGS 

QRGSDAAGTMGCCTGRCSLICLCALQLVSAL 

ERQrFDFLGFQWAPILGNFLHIIWlLGLFGTIQ 

YRPRYIMV YTV WTALWVTWN VFIICF YLE VG 

GLSKDTDLMTFNI S VHRS W WREHGPGCVRR 

VLPPSAHGMMDDYTYVSVTGCIVDFQYLEVI 

H3A 


1287


2637


A


IUIUj 


252 


376 


RSRMGDKPI WEQIGSSF IQHYY QLFDNDRTQL 
GAJYVSFQL 


1288 


2638 


A 


10107 


1


478 


MEEEDESRGKTEESGEDRGDGPPDRDPTLSPS 

AFILRAIQQAVGSSLQGDLPNDKDGSRCHGL 

RWRRCRSPRSEPRSQESG GIDTATVLDMATD 

SFLAGLVSVLDPPDTWVPSRLDLRPGESEDM 

LELVAEVRIGDRDPIPLPVPSLLPRLRAWRTG 

KT 


1289 


2639 


A 


10113


237 


438 


LLSKMPSTNRAGSLKDPEIAELFFKEDPEKLFT 

DLREIGHGSFGAAYFARDVRTNEWAIKKMS 

YSG 


1290 


2640 


A


in 1 

1UI It 


JO / 




RGAKAKSA VLPPGPPCS SIL1LSPPAPLTPRSPG 

TEATRPTAMSKSLKKKSHWTSKVHESVR3RN 

PEGQLGFELKGGAENGQFPYLGEVKPGKVAY 

ESGSKLVSEELLLEVNETPVAGLTIRDVLAVI 

KHCKDPLRLKCVKQGESSGLLSVLPGGGTAR 

GAGQ 


1291 


2641


A 


10116 


128 


591 


RTIRETERRSALSCSVLKSEPLPGLQPQASQQR 

LGSGRGSGGLSSQLKCFCSKRRRRRRSKRKDK 
V S I L STFL APFKHL S PG ITNTEDD DTI. ST S S AE 
VKENRNVGNLAARPPPSGDRARGOATR 


1292 


2642 


A 


10121 


1 


749 


qrrrfraglwgghgltdglrrnggcgcsar 

vprvgerlrghrcpdplcllldmlflsfhag 

sweswcccclipadrpwdrgqhwqlemadt 

rsvhetrfeaavkviqslpkngsfqptnemm 

lkfysfykqategpciclsrpgfwdpigrykw 

dawsslgdmtkeeamiayveemkkiietmp 

mtekveellrvigpfyeivedkksgrssditsd 
LGNVLTSTPNAKTVNGKAESSDSGAESEEEE
AC 






A 


1 U 1 £ry 


*) 


080 


PI MSI VRVVEFVAASSAOKTPSRLENYYMVC 

KADEKJFNQLVHFLRJSTHKQEKHLVFFRYSSGL 

CGRGIRDSARMCSTCACVEYYGKALEVLVK 

GVK1MCIHGKMKYKRNKIFMEFRKLQSGILV 

CTDVMARGIDIPEVNWVLQYDPPSNASAFVH 

RCGRTARIGHGGSALVFLLPNfEESYINFLAIN 

QKCPLQEMKPQRNTADLLPKLKSMALADRA 

VreKGMKAFVSYVQAYAKHECNLIFRLKDL 

DFASLARGFALLRMPKMPELRGKQFPDFVPV 

DVNTDTIPFKDKIREKQRQKLLEQQRREKTEN 


1294 


2644 


A 


10129 


91 


1042 


VTMYKDCIESTGDYFLLCDAEGPWGIILESLA 

TI rilVVTlI I T T APT FT M"R VTnTV^riWMVT PTO 

LLFLLSVLGLFGLAFAF1IELNQQTAPVR.YFLF 

GVLFALCFSCLLAHASNLVKLVRGCVSFSWT 

T1LCIAIGCSLLQIILATEYVTLIMTRGMMFVN 

MTPCQLNVDFVVLLVYVLFLMALTFFVSKAT 

FCGPCENWKQHGRLIF1TVLFSUIWVVW1SML 

LLYIVPELCILYRSCRQECPLQGNACPVTAYQ 
HSFQVENQELSRDKWKVLLNSDFLSHSGA 


1295 


2645 


A 


10133 


376 


518 


RPRWTHNSQWCFLPQDHPGWLPGQSGAPG 

v J rxvJ/Vl IW^ll V_Tr vj o 3 YV IvV^ V 


1296 


2646 


A 


10135 


3 


551 


EWSLDPFMGIMSGQVGDLSPSQEfCSLAQFRE 
N1QDVLSALFNPDDYFLLRWLQARSFDLQKS 
EDMLRKHMEFRKQQDLAN1LAWQPPEVVRL 
v>jA\rntPounr;Fr;spvwvT-iTvr;«!onPKr;T t i 

I \J 1\^\J 1MLJ KJ LZfKJ kJX V TT I 111 V VJgVL/r I\.\J1j1jLi 

SASKQELLRD3FRSCELLLRECELQSQKLGKR 
VEKnAIFGLEGLGLRDLWKPGIELLQE 


1297 


2647 


A 


10138 


48 


407 


MVSSCCGSVCSDQGCGQDLCQETCCRPSCCE 
TTCCRTTCCRPSCCVSSCCRPQCCQSVCCQPT 
CSRPSCCOTTCCRTTCYRPSCCVSSCCRPOCC 
QPVCCQPTCCRPSCCETTCCHPXCC 


1298 


2648 


A 


10156 


94 


453 


GGNRKSAEMFSQVPRTPASGCYYLNSMTPEG 
QSMYLRFTX)TTRilSPYRMSRILARHQLV r TKJ 
QQEIEAKEACDWLRAAGFPQYAQLYEDSQFP 
INIVAVKKDHDFLEKDLGEPLCRRLNT 


1299 


2649 


A 


10161 


1 


393 


PRPSELVDGRGRVSARFGGSPSKAATVRSQPT 
ASAQLENMEEAPKRVSLALQLPEHGSKDIGN 
VPGNC SENPCQNGGTCVPGAD AHSCDCGPGF 
KGRRCFXACrKVSRPCTRLFSETKAITVWEGG 
VCHHV 


1300 


2650 


A 


10162 


98 


391 


AKIASLEPJMPANYTCTRPDGDNTDFRYFIYA 
VTrrGILGPGLlGNlLALWVFYGYMKETKRA 
VlTMINLAIADLLQVLSLPLRrFYYLKHDWPF 
VPV 


1301 


2651 


A 


10165 


1 


7545 


PGIRVG 1TSQTGLSSNLQENCSKL AF1S SHGTE 

KQLQCMPMEGRGRASSSISDLQGKGFEKGTG 

EKHVPGVGSARHSPQASAGGSPWQRGKAQT 

RWLGKPDPGRKLRRRGSPQEEGGLRVSAAAR 

LLCSGANRCKVLVRQNSTPNTQQPAVHPSTP 

PSRPLPQAGRCLVAPLRPHPDWVAAKTLAKA 

LRAPGKPWRLAAPSPLGDLGAPGLPGPSTAP 

RTL S VE EPGVECNQLCL Y AD VTDP VL CLGQK 

DPGVT.GKHCEKEKISSSKELKJ-IVHAKSEPSKP 

AJlRLSESLFTVVDENKNESinEREHKRRTSTPV 

IMEGVQEETDTRDVKRQVERSEICTEEPQKQ 

KSTLKNEKHLKKDDSETPHLKSLLKKEVKSS

KEKPEREKTPSEDKLSVKHKYKGDCMHKTG 

DETELHSSEKGLKVEENIQKQSQOrTKLSSDDK 

TERKSKHRNERKLSVLGFCDGKPVSEYIIKTDE 

>fVRXENNKKERRLSAEKTKAEHKSRRS SDSK 

IQKDSLGSKQHGITLQRRSESYSEDKCDMDST 

NMDS>JLKPEEVVHKEKRRTKSLLEEKLVLKS 

KSKTQGKQVKWETELQEGATKQATTPKPD 

KEKNTEENDSEKQRKSKVEDKPFEETGVEPV 

LETASS S AHSTQKD S SHRAKLPL AKEKYKSD 

KX>STSTRLERJO.SDGHKSRSLKHSSKDIKKKD 

ENKSDDKDGKEVDSSHEK/VRGNSSLMEKICL 

SRRLCENRRGSLSQEMAKGEEKLAANTLSTP 

SGSSLQRPKKSGDMTLIPEQEPMEIDSEPGVE 

N VFEV SKTQDNRNNNSHQDIDSENMKQKT S 

ATVQKDELRTCTADSKATAPAYKPGRGTGV 

NSNSEKHADHRSTLTKXMHIQSAVSKMNPGE 

KEPIHRGTTEVNIDSETVHRMLLSAPSENDRV 

QKNLKNTAAEEHVAQGDATLEHSTNLDSSPS 

LSSVTWPLRESYDPDVIPLFDKRTVLEGSTA 

STSPADHSALPNQSLTVRESEVLKTSDSKEGG 

EGFTVDTPAKASITSKRHIPEAHQATLLDGKQ 

GKVIMPLGSKLTG VI VENENITKEGGL VDMA 

KKENDLNAEPNLKQTIKATVENGKKDGIAVD 

HWGLNTEKYAETVKLKHKRSPGKVKDISID 

VERRNENSEVDTSAGSGSAPSVLHQRNGQTE 

DVATGPRRAEKTSVATSTEGKDKDVTLSPVK 

AGPATTTSSETRQSEV.AI-PCTSrFADFGLIIGT 

HSRNNPLHVGAEASECTVFAAAEEGGAWTE 

GFAESETFLTSTKEGESGECAVAESEDRAADL 

LAVHAVKIEANVNSVVTEEKDDAVTSAGSEE 

KCDGSLSRDSEIVEGTITFISEVESDGAVTSAG 

TEIRAGSISSEEVDGSQGNMMRMGPKKETEG 

TVTCTG AEGRS DNF VICS VTGAGPREERMVT 

GAGWLGDNDAPPGTSASQEGDGSVNDGTE 

GESAVTSTGITEDGEGPASCTGSEDSSEGFAIS 

SESEENGESAMDSTVAKEGTNVPLVAAGPCD 

DEGIVTSTGAKEEDEEGEDWTSTGRGNEIGH 

ASTCTGLGEESEGVLICESAEGDSQIGTWEH 

VEAEAGAA1MNANENNVDSMSGTEKGSKDT 

DICSSAKGIVESSVTSAVSGKDEVTPVPGGCE 

GPMTSAASDQSDSQLEKVEDTTISTGLVGGS 

YDVLVSGEVPECEVAHTSPSEKEDEDIITSVE 

NECCDGLMATTASGDITNQNSLAGGKNQGK 

VLI1STSTTNDYTPQVSAITDVEGGLSDALRTE 

ENMEGTRVTTEEFEAPMPSAVSGDDSQLTAS 

RSEEKDECAMISTSIGEEFELPISSATTTKCAES 

LQPVAAAVEERATGPVLISTADFEGPMPSAPP 

EAESPLASTSKEEKDECALISTSTAEECEASVS 

G WVESENERAGT VMEEKDGSGUSTS S VEE>C 

EGPVSSAVPQEEGDPSVTPAEEMGDTAMISTS 

TSEGCEAVMIGAVLQDEDRLTITRVEDLSDA 

A1ISTSTAECMPISASIDRHEENQLTADNPEGN 

GDLSATEVSKHKVPMPSLIAENNCRCPGPVR 

GGKEPGPVLAVSTEEGHNGPSVHKPSAGQGH 

PSAVCAEKEEKHGKECPEIGPFAGRGQKESTL 

HLrNAEEKNVLLNSLQKEDKSPETGTAGGSST 

ASYSAGRGLEGNANSPAHLRGPEQTSGQTAK 

DSSVSSIRYLAAVNTGAIKADDMPPVQGTVA 

EHSFLPAEQQGSEDNLKTSTTKCITGQESKIAP 
SHTMJPPATYSVALLAPKCEQDLTIKNDYSGK

WTDQASAEK.TGDDNSTRK.SFPEEGDIMVTVS 

SEENVCDIGNEESPLNVLGGLKLKANLKMEA 

YVPSEEEKNGEILAPPESLCGGKPSGIAELQRE 

PLLVNESLNVENSGFRTNEEIHSESYNKGE1SS 

GRKDNAEAISGHSVEADPKEVEEEERHMPKR 

KRKQHYLSSEDEPDDNPDVLDSRIETAQRQC 

PETEPHATKEENSRDLEELPKTSSETN STTSRV 

MEEKDEYSSSETTGEKPEQNDDDTIKSQE 


1302 


2652 


A 


10167 


321 


842 


EFSLFPFLRPSPARPPPRPPAPFPSPELAGPEPH 

FVFYFFLSYVHPPKELAKYEYMEEQVILTEKG 

NSTVAGRGTSVRCLSPSPRPLPPLLPLLADLLE 

DGFGEHPFYHCLVAEVPKEHWTPEGNPSPFP 

EARETKC YVR S S VGCVEPLTTQAE VTENLDR 

KNSQQVFKLLKKK 


1303 


2653 


A 


10171 


206 


429 


NMILLKKRRXXINSLGEGTINGLLDELLETNV 
LSQEDTEIVK.CENVTV1DKARDLLDSVIRKGA 
RACEICITY1 


1304 


2654 


A 


10184 


970 


1524 


LCTL SPGI SGTAG SCLTTEPGTELGTSF AQNGF 

YHE A V VLFTQALKLNPQDHRL FGNRSFCHER 

LGQPAWALADAQVALTLRPGWPRGLFRLGK 

ALMGLQRFREAAAVFQETLRGGSQPDAAREL 

RSCLLHLTLQGQRGGICAPPLSPGALQPLPHA 

ELAPSGLPSLRCPRSTALRSPGLSPLLH 


1305 


2655 


A 


10194 


2 


394 


TDLLGRRFRVDGAAMAACEGRRSGALGSSQ 

SDFLTPPVGGAPWAVATTVVMYPPPPPPPHR 

DFISVTLSFGESYDNSKS\MIRRSCWRKWKQL 

SRLQRNMILFLLAFLLFCGLLFYTNLADHWKG 

IRNTCT 


1306 


2656 


A 


10195


1 


410 


IPGSTISLEGPLSKWTNVMKGWQYRWFVLDY 
NAGLLSYYTSKDKMMRGSRRGCVRLRGAVI 
GIDDEDDSTFT] TVD QKTFHFQ ARD ADEREK 
WIHALEETILRHTLQLQVRVFTWFPDSSLVGA 
FFFWLVSGFFFK 


1307 


2657 


A 


10205 


85 


308 


QGLPSTM VK.LG CSFSGKPGKDPGDQDG AAM 
DS VPLISPLDI S QLQPPLPDQ WIKTQTE YQLS 
SPDQQNYTKSR 


1308 


2658 


A 


10214 


2 


453 


ECGGIRQPGPGPPPALASAPAATMNRVGGSPS 
AAANYLLCTNCRKVLRKDKRIRVSQPLTRGP 
SAFIPEKEWQANTVDERTNFLVEEYSTSGRL 
DNITQVMSLHTQYLESFLRSQFYMLRMDGPL 
PLPYRHYIAIMAAARHQCSYLINM 


1309 


2659 


A 


10233 


45 


421 


RGWPEQQSTGRPRDVARQPRCQKEEGRRLRP 

RALESRTFQGSERSRWGPPLESTKENVQCGH 

RPAFPNSSWLPFHERLQVQNGECPWQVS1QM 

SRKHLCGGSILHWWWVLTAAHCFRRTLLDM 

AV 


1310 


2660 


A 


10241 


243 


442 


AFQLFNAKCESAFLSKRNPLQRNWTVLYRRK 
HKKGQSAEIQKKRTRRAFKFQRAITGASLADI 
MAX 


1311 


2661 


A 


10261 


751 


176 


LPGADYG GGHLSLRLFHLLLTS AA WVPDESQ 

VTLNSAICVLSTVLIMEFPDLGKHCSEKTCKQ 

LDFLPVKCDACKQDFCKDHFPYAAHKCPFAF 

QKDVHVPVCPLCNTPIPVKKGQIPDVWGDin 

DRDCDSHPGKJCKEK1FTYRCSKEGCKKKEML 

QMVCAQCHGNFCIQHRHPLDHSCRHGSRPTI 

KAG 


1312 


2662 


A 


10270 


3 


669 


STSSDEGSPSASTPMINKTGFKFSAEKPVIEVP 
SMTILDKKDGEQAKALFEKVRKFRAHVEDSD 
LIYKLYWQTVDCTAKF1FILCYTANFVNAISF

EHVCKPKVEHLIGYEVFECTHNMAYMLKKL 

LISYlSnCVYGFlCLYTLPWLFRIPLKEYSFEKV 

REESSFSDCPDVKNDFAFLLHMVDQYDQLYS 

KRFGVFLSEVSENKLREISLNHEWTFEKL 


1313 


2663 


A 


10287 


1221 


266 


GAHRVLSPAQGAQPRLRSAASVEVSMVGQR 

VLLLVAFLLSGVLLSEAAKILTISTLGGSHYLL 

LDRVSQILQEHGHNVTMLHQSGKFLIPDIKEE 

EKSYQVIRWFSFEDHQKRIKKHFDSYIETALD 

GRKESEALVKXMEIFGTQCSYLLSRKDIMDSL 

KNENYDLVFVEAFDFCSFL1AEKLVKPFVAIL 

PTTFGSLDFGLPSPLSYVPVFPSLLTDHMDFW 

GRVKNFLMFFSFSRSQWDMQSTFDNTIKEHF 

PEGSRP VLSHLLLKAEL WFVN SDCAFDFA RPL 

LPNTVYIGGLMEKPIKPVPQVSEPSAFSLGFT 


1314 


2664 


A 


10288 


536 


1890 


NVQLAKFSSTLVFFFSCDADPSALAKYVLAL 
VKKDKSEKELKALC1DQLDVFLQKETQIFVEK 
LFDAVNTKSYLPPPEQPSSGSLKVEFFPPQEK 
DIKKEErTKEEEREKKFSRRLNHSPPQSSSRYR 

RYNRRRGRSRS YSRS RSRS W SKERLRERDRD 

RSRTRSRSRTRSRERDLVKPKYDLDRTDPLEN 

NYTPVSSVPSISSGHYPVFTLSST1TVIAPTHHG 

Nrm-ESWSEFHEDQVDHNSYVRPPMPKKRC 

RDYDEKGFCMRGDMCPFDHGSDPVWEDVN 

LPGMQPFPAQPPVVEGPPPPGLPPPPP1LTPPPV 

NLRPPVFPPGPLPPSLPPVTGPPPFLPPLQPSG 

MDAPPNSATSSVPTVVTTG1HHQPPPAPPSJLFT 

ADTYDTDG YNPEAP SITNTSRPMYRHRVHPR 


1315 


2665 


A 


10293 


447 


1331 


SHPLL SCPEKVS AKLRAAAE AAAEERRTRGA 

GSRGICAGLRSVAPGPEPLKQEEGRREWCjSSI 

GTPSPCGSAQAAAAAAAEEATEKIPALRPALL 

W ALL AL WL CCA TP AHALQCRDGYEPCVNEG 

MCVTYHNGTGYCKCPEGFLGEYCOHRDPCE 

KNRCQNGGTCVAQAMLGKATCRCASGFTGE 

DCQYSTSHPCFVSRPCLNGGTCHMLSRDTYE 

CTCQVGFTGRNPKCPGGNLNYQFNGIIWYS 

GGSVPPSGTKTSKPAEHNAMGTGSKNFASGT 

LWVMVSGATSTSTSTL 


1316 


2666 


A 


10294 


118 


572 


SLSMESNHKSGDGLSGTQKEAALRALVQRTG 

YSLVQENGQRKYGGPPPGWDAAPPERGCEIFI 

GKLPRDLFEDELIPLCEKIGKIYEMRMMMDF 

NGNNRGYAFVTFSNKVEAKNAIKQLNNYEIR 

NGRLLGVCASVDNCRLFVGGIPKTKK 


1317 


2667 


A 


10301 


158 


1956 


LLXSCGVJLLSGVCIPCEGKGPTVLVIQTAVPQ 

DRPTKSSMRSAAKPWNPAIRAGGHGPDRVRP 

LPAASSGMKSSKSSTSLAFESRLSRLKRASSE 

DTLNKPGSTAASGVVRJLKKTATAGAISELTES 

RLRSGTGAFTTTKRTGIPAPREFSVTVSRERSV 

PRGPSNPRKSVSSPTSSNTPTPTKHLRTPSTKP 

KQENEGGEKAALESQVRELLAEAKAKDSEIN 

RLRSELKKYKEKRTLNAEGTDALGPNVDGTS 

VSPGDTEPMIRAIJEEKNKMFQKELSDLEEENR 

VLKEKLrYLEHSPNSEGAASHTGDSSCPTSlTQ 

ESSFGSPTGNQLSSDIDEYKKNIHGNALRTSG 

SSSSDVTKASLSPDASDFEHITAETPSRPLSSTS 

NPFKSSKCSTAGSSPNSVSELSLASLTEKJQKM 

EENHHSTAEELQATLQELSDQQQMVQELTAE 
MNLLQEK.VKNEEPTTQEGKIIELEQKCTGILE
QGRFEREKLLNIQQQLTCSLRKVEEENQGAL 
EM1KJU.K£ENEKXNEFLELER11NNNMMAKTL 
EECRVTLEGLKMENGSLKSHLQG 


1318 


2668 


A 


10303 


333 


879 


GECFIMAAWQQNDLVFEFASNVMEDERQL 
npiP a fpp A vivPHA/pn a put niqvap.i a rVFPP 

UUr nlr i f\ V 1 V Citl V rVJ/vUlLdT O I /\VJl_./\.l^, V Jc,Jc_r 

NDMTTESSLDVAEEEiroDDDDDITLTVEASCH 
DGDETTETTEAAEALLNMDSPGPMLDEKRTNN 
NIFSSPEDDMWAPVTHVSVTLDGIPEVMETQ 
QVQEKYADSPGASSPEQPKRKKX 


1319 


2669 


A 


10322 




654 


i #t \ m \ 4<2t~* c\7 a \ rrD ata\^>^t t t t t tiatai ct 
MbYKMoOaVAV J KAiAVrUJLXLLLlJAi AJ_oL 

LIGAKSLPASWLEAFSGTCQSADCTTVt.DAR

LPRTLAGLLAGGALGLAGALMQTLTRNPLAD

PGLLGVNAGASFAIVLGAALFGYSSAQEQLA

MAFAGALVASLIVAFTGSQGGGQLSPVRLTL 

AU V XL, 


1320 


2670 


A 


10323 


441 


2 


KMNQVAWIGGGQTLGAFLCHGLAAEGYRV 1 
AWDIQSDKAANVAQFJNAEYGESMAYGFG 
ADATSEQSVLALSRGVDEIFGRVDLLVYSAGI 
AKAAFI SDFQLGDFDRSLQ VNLVG YFLC ARE 


1321 


2671 


A 


10332 


1 


453 


RHRTAGPGSTISSRTDSASAPAARAMPCEYTY 
AKLTS DCSRPSLQWYTRAQ SKMRRPRLLLKD 
ILKCTLLVFGVRILY1LKLNYTTEECDMKNMH 
Y VDi*L>tLVKKAyKY At^JVU^Kh.at'fKr AK L S> 
MALLFEHRYSVDLLPFVQKAPTDSEA 


1322 


2672 


A 


10333 


25 


423 


EPSNGPVVYSALGNEDDEILLLGKDIIGTFAAS 

ERKMRAHQVLTFLLLFVITSGASENASTSRGC 

GLDLLPQNVYLCDLDAIWGIWEAVAGAGA 

LITLLLMLILLGRLPFlKEKEKiCSPAVLHFLFL 

LGTLG 


1323 


2673 


A 


10334 


52 


426 


SSLGNEDDEILSLAKDITG>4FVASHRKMRAH 
QVLTFLLLFVrrSVASENASTSRGCGLDLLPQ 
YVSLCDLDAIWGIWEAAAGAGALITLLLMLI 


1324 


2674 


A 


10336 


1 


932 


ERLCFPCMQSKIYSYMSPNKCSGMRFPLQEE 

N SVTHHEVKCQGKPL AGI YRKREEKRN AGN 

AVRSAMKSEEQKUCDARKGPLVPFPNQKSEA 

AEPPKTPPSSCDSTNAAIAKQALKKP1KGKQA 

PRKKAQGKTQQNRKLTDFYPVRRS SRKSKAE 

LQSEERKR1DELIESGKEEGMKIDLIDGKGRG 

VIATKQFSRGDFWEYHGDLIEITDAKKREAL 

YAQDPSTGCYMYYFQYLSKTYCVDATRETN 

RLGRLrNHSKCGNCQTKLHDlDGVPHLILIAS 

t&AJLf\f\.KJUixLL/Ls I L/ I VJ 1_/I\_> r_/^*-?l r._r\_ll VV J_.lN.ll 


1325 


2675 


A 


10338 


3 


870 


PGSTISCSELKGTQCRATAG SRGRRPPMTC WL 

RGVTATFGRPAEWPGYLSHLCGRSAAMDLG 

PMRKSYRGDREAFFFTHT TSLDPVKOFA AWF 

EEA VQCPDI GEANAMCLATCTRDGKFSARML 

LLKGFGKDGFRFFTNFESRKGKELDSNPFASL 

VFYWEPLNRQVRVEGPVKKLPEEEAECYFHS 

RPKSSQ1GAVVSHQSSVTPDREYLRKKNEELE 

QLYQDQEVPKPKSWGGYVLYPQVMEFWQG 

QTNRLHDPJVFRRGLPIGDSPLGPMTI1RGEE 

DWLYERLAP 


1326 


2676 


A 


10344 


2 


984


ARAAAHCGTCRLVRWWRKRRSVMGIQTSPV 
LLASLGVGLVTLLGLAVGSYLVRRSRRPQVT 
LLDPNEKYLLRLLDKTTVSHNTKRFRFALPTA 
HHTLGLPVGKHIYLSTRIDGSLVIRPYTPVTSD

EDQGYVDLVIKVYLKGVHPKFPEGGKMSQY 

LDSLKVGDWEFRGPSGLLTYTGKGHFNIQP 

r^KKSPPEPRVAKKLGMIAGGTGITPMLQLIRA 

ILKVPEDPTQCFLLFANQTEKD1ILREDLEELQ 

ARYPNRFKLWFTLDHPPKDWAYSKGFVTAD 

MIREH LPAPGDDVL VLLCOPPPMVQLACHPN 

LDKLGYSQKMRFTY 


1327 


2677 


A 


10345


1


968 


LQS AG EG VTHVLILLESPARP V AA VTQ VQRR 

RYHRLSDMSMLAERRRKQKWAVDPQNTAW 

SNDDSKFGQRMLEKMGWSKGKGLGAQEQG 

ATDHIKVQVKNNHLGLGATINNEDKWIAHQ 

DDFNQLLAELNTCHGQETTDSSDKKEKKSFS 

LEEKSKISKNRVHYMKFTKGKDLSSRSKTDL 

DCIFGKJIQSKKTPEGDASPSTPEENETTTTSAF 

TIQF.YFAKPvMAALKNKPQVPVPGSDISETQVE 

RKRGKKRNKEATGKDVESYLQPKAKRHTEG 

KPERAEAQERVAKKKSAPAEEQLRGPCWDQ 

S SKASAQDAGDHVQPA 


1328 


2678 


A 


10346


173


439 


GSAAMKVKIKCWNGVATWLWVANDENCGI 
CRMAFNGCCPDCKVPGDDCPLVWGQCSHCF 
HrvTHCIIJCWLflAQQVQQHCPMCRQEWKFKE 


1329 


2679 


A 


10351 


3 


964 


QMEPGNDTQISEFLLLGFSQEPGLQPFLFGLFL 

S MYL VT VLGNLLULATI SDSHLHTPM YFFLSN 

LSFADICVTSTTrPKMLMNIQTQNKVTTYIACL 

MQMYFFILFAGFENFLLSVMAYDRFVAICHP 

LHYMV1MNPHLCGLLVLASWTMSALYSLLQI 

LMVVRLSFCTALEIPHFFCELNQVIQLACSDSF 

LNHMVIYFTVALLGGGPLTGILYSYSKHSSIH 

AISS AQG KYKAFSTCASHLS W SLFYG AILG V 

YLSSAATRNSHSSATASVMYTWTPNfLNPFI 

YSLRNKDIKRALGJHLLWGTMKGQFFKKCP 


1330 


2680 


A 


10352 


34 


2573 


1PFLKSCCCCCLFDFPPPPLDQVQEEECEVERV 

TEHGTPKPFRXFDSVAFGESQSEDEQFENDLE 

TDPPNWQQLVSREVLLGLKPCEIKRQEVINEL 

FYTERAHVRTLKVLDQVFYQRVSREGILSPSE 

LRKIFSNLEDn.QLHIGLNEQMKAVRKRNETS 

VTDQIGEDLLTWFSGPGEEKLKHAAATFCSNQ 

PFALEMIKSRQKKDSRFQTFVQDAESNPLCRR 

LQLKDnPTQMQRLTKYPLLLDNIATYTEWPT 

ERFJCVKKAADHCRQaNYVNQAVKEAENKQ 

RLEDYQRRLDTSSLKLSEYPNVEELRNLDLTK 

RKMIHEGPL VWK VNRDKTTDL YTLLLEDIL V 

LLQKQDDRLVLRCHSKILASTADSKHTFSPV1 

KXSTVLVRQVATDNKALFVISMSDNGAQIYE 

LVAQTVSEKTVWQDLICRMAASVKEQSTKPl 

PLPQSTPGEGDNDEEDPSKLKEEQHGISVTGL 

QSPDRDLGLESTLISSKPQSHSLSTSGKSEVRD 

LFVAERQFAKEQrTrDGTLKEVGEDYQlAJPDS 

HLPVSEERWALDALRNLGLLKQLLVQQLGLT 

EKS V QEDWQHFPRYRTASQGPQTDS VIQNSE 

N1KA YHSGEGHMPFRTGTGDIATCYSPRTSTE 

SFAPRDSVGLAPQDSQASNfLVMDHMIMTPE 

MPTMEPEGGLDDSGEHFFDAREAHSDENPSE 

GDGAVNKEEKDVNLRISGNYLILDGYDPVQE 

SSTDEE VAS SLTXQPMTGIPA VESTHQQQHSP 

QNTHSDGAISPFTPEFLVQQRWGAMEYSCFEI 

QSPSSCADSQSQIMEY1HKIEADLEHLKKVEE 

SYTTLCQRLAGSALTDKHSDKS 


1331


2681 


A 


10353 


1 


2100 


AVEFAEGALTMAPWPELGDAQPNPDKYLEG 

AAGQQPTAPDKS KETNKTDNTE AP VTKIELLP 

SYSTATLIDEPTEVDDPWNLPTLQDSGIKWSE 

RDTKGKILCFFQGIGRLILLLGFLYFFVCSLDIL 

SSAFQLVGGKMAGQFFSNSSIMSNPLLGLVIG 

VLVTVLVQSSSTSTSIVVSMVSSSLLTVRAAIP 

IIMGANIGTSITNTIVALMQVGDRSEFRRAFA 

GATVHDFFNWLSVLVLLPVEVATHYLEnTQL 

IVESFHFKNGEDAPDLLKVITKPFTKLIVQLDK 

K.V1SQIAMNDEKAKNKSLVKIWCKTFTNKTQ 

IN VT VPSTANCTS PS LC WTDG IQNWTMKN VT 

LCGCLIMIVKILGSVLKGQVATVIKKTTNTDFP 

FPFAWLTGYLAILVGAGMTFIVQSSSVFTSAL 

TPUGIGVITIERAYPLTLGSNIGTTTTAILAAL 

ASPGN ALRSSLQIALCHFFFNI SGILL WYPIPFT 

RLPIRMAKGLGN1SAKYRWFAVFYLIIFFFLIP 

LTVFGLSLAGWRVLVGVGVPWFIIILVLCLR 

LLQSRCPRVLPKKLQNWNFLPLWMRSLKPW 

DAWSFCFTGCFQMRCCCCCRVCCRACCLLC 

GCPKCCRCSKCCEDLEEAQEGQDVPVKAPET 

FDNIT1SREAQGE VPASDSKTECTAJL 


1332 


2682 


A 


10354 


30 


1377 


SQQGSQPHRQGPPSLLTAPHSLDLPALPPGPR 
GSQGKLRRVLVPMSVKPSWGPGPSEGVTAVP 
TSDLGE1HN WTELLDLFNHTLSECFfVELSQST 
KRVVLFALYLAMFVVGLVENLLVICVNWRG 

3UK-tt.l_I.LIvl.PiL, I ILlNlVlrtl rtJJLUl VLSLr V VV IVli^t- 

VTLDYTWLWGSFSCRFTHYFYFVNMYSS1FF 

LVCLSVDRYVTLTSASPSWQRYQHRVRRAM 

CAG1WVLSAIIPLPEWHIQLVEGPEPMCLFM 

APFETYSTWALAVALSTTILGFLLPFPLITVFN 

VLTACRLRQPGQPKSRRHCLLLCAYVAVFV 

MCWLPYHVTLLLLTLHGTHISLHCHLVHLLY 

FFYDVIDCFSMLHCVrNPILYNFLSPHFRGRLL 

NAVVHYLPKDQTKAGTCASSSSCSTQHSinT 

KGDSQPAAAAPHPEPSLSFQAHHLLPNTSPISP 

TOPLTPS 


1333 


2683 


A 


10358 


2 


884 


AAG AG ADG REP ASERASRAEPP A VAMGQND 

LMGTAEDFADQFLRVTKQYLPHVARLCLIST 

FLEDGIRMWFQWSEQRDYIDTTWNCGYLLA 

SSFVFLNLLGQLTGCVLVLSRNFVQYACFGLF 

Gil ALQT1A YS IL WDLKFLMRNL ALGGGLLLL 

LAESRSEGKSMFAGVPTMRESSPKQYMQLGG 

RVLLVLMFMTLLHFDASFFSIVQNIVGTAI.MI 

LVAJGFKTKLAALTLVVWLFArNVYFNAFWT 

IPVYKPMHDFLKYDFFQTMSVIGGLLLWAL 

GPGGVSMDEKKKEW 


1334 


2684 


A 


10367 


59 


1562 


QAWSLQVALSPFFFPASPSNSFAAAVPQLLFP 

ELPLPHVPGQESAKRRSARRFLIMSELTKELM 

EL VWGTKSS PGL SDTIFCRWTQGF VFSESEGS 

ALEQFEGGPCAVIAPVQAFLLKKLLFSSEKSS 

WRDCSQEEQKELLCHTLCDILESACCDHSGS 

YCLVSWLRGKTTEETAStSGSPAESSCQVEHS 

SALAVEELGFERFHALIQKRSFRSLPELKDAV 

LDQYSMWGNKFGVLLFLYSVLLTKGIENTKN 

EIEDASEPLIDPVYGHGSQSLINLLLTGHAVSN 

VWDGDRECSGMKLLGIHEQAAVGFLTLMEA 

LRYCKVGSYLKJSKJPYLDCLASETHLTVFFA 

KDMALVAPEAPSEQARRVFQTYDPEDNGnP 

DS1XEDVMKALDLVSDPEYINLMKNKLDPEG 
LGIILLGPFLQEFFPDQGS SGPESFTVY H YNGL
KQ SN YNEK VM Y V EGT A V V MGFEDFMLQTD 
DTPEKRCLQTKWPYIELLWTTDRSPSLN 


1335 


2685 


A 


10375 


82 


2929 


TRTKRRLGREKAMA SPPRG W GCGELLLPFML 

LGTLCEPGSGQIRYSMPEELDKGSFVGNIAKD 

LGLEPQELAERGVRIVSRGRTQLFALNPRSGS 

LVTAGRIDREELCAQSPLCVVNFNILVENKM 

KrYGVEVEIIDINDNFPRFRDEELKVKVNENA 

AAGTRLVLPFARDADVGVNSLRSYQLSSNLH 

FSLDVVSGTDGQKYPELVLEQPLDREKETVH 

DLLLTALDGGDPVLSGTTH1RVTVLDANDNA 

PLFTPSEYSVSVPENIPVGTRLLMLTATDPDE 

GINGKLTY SFRNEEEKISETFQLD SNXGEISTL 

QSLDYEESRFYLMEWAQDGGALVASAKVV 

VTVQDVNDNAPEVILTSLTSSISEDCLPGTVIA 

LFSVHDGDSGENGEIACSIPRNLPFKLEKSVD 

NYYHLLTTRDLDREETSDYNITLTVMDHGTP 

PL STESHIPLKV ADVNDNPPNFPQ AS YSTS VT 

ENNPRGVSIFSVTAHDPDSGDNARVTYSLAE 

DTFQGAPLSSYVSINSDTGVLYALRSFDYEQL 

RDLQLWVTASDSGNPPLSSNVSLSLFVLDQN 

DNTPEILYPALPTDG STG VEL APRS AEPG YLV 

TKWAVDKDSGQNAWLSYRLLKASEPGLFA 

VG LHTGEVRTAR ALLDRDALKQ SL WAVED 

HGQPPLSATFTVTVAVADRIPDILADLGSIKTP 

IDPEDLDLTLYLVVAVAAVSCVFLAFVIVLLV 

LRLRRWHKSRLLQAEGSRLAGVPASHFVGV 

DGVRAFLQTYSHEVSLTADSRKSHLIFPQPNY 

ADTL L SEE SCEKSEP LLM S DKVD AN KEERR V 

QQAPPNTDWRFSQAQRPGTSGSQNGDDTGT 

WPNNQFDTEMLQAMILASASEAADGSSTLGG 

GAGTMGLSARYGPQFTLQHVLQGELGSDYR 

QNVYIPGSNA-TLTNAAGKRDGKAPAGGNGN 

KKKSGKKEKK 


1336 


2686 


A 


10379 


1 


557 


RPRRRQPSFSCRVLVLEDPPCFRFTNSMNQEK 
L AKJLQAQ VRIGGKGTARRKKK WHRTATAD 
DKKLQSSLKJCLAVNN1AGIEEVNM1KDDGTVI 
HFNNPKVQ ASLSANTFAITGl LAEAKPITEMLP 
GILSQLGADSLTSLRKLAEQFPRQVLDSKAPK 
PEDIDEEDDDVPDLVENFDEASKNEAN 


1337 


2687 


A 


10380 


1 


1263 


IPG STISW SPAAARGLS VCRCCRLHP AS AMDL 

FGDLPEPERSPRPAAGKEAQKGPLLFDDLPPA 

SSTDSGSGGPLLFDDLPPASSGDSGSLATSISQ 

MVKTEGKGAKRKTSEEEKNGSEELVEKKVC 

KASSVTFGLKGYVAERKGEREEMQDAHVILN 

DITEECRPPSSLITRVSYFAVFDGHGGIRASKF 

AAQNI.HQNLIRKFPKGDV1SVEKTVKRCLLD 

TFKHTDEEFLKQASSQKPAWKDGSTATCVLA 

VDN3LYIANLGDSRAILCRYNEESQKHAAL SL 

SKEHNPTQYEERMRIQKAGGKVRDGRVLGV 

LEVSRS1GDGQYXRCGVTSVPDIRRCQLTPND 

RFrLLACDGLFKVFTPEEAVNFILSCLEDEKIQ 

TREGKS AADARYEAACNRLAhfKA VQRG SAD 

NVTVMWRIGH 


1338 


2688 


A 


10385 


3 


589 


GPSQSMAAGELEGGKPLSGLLNALAQDTFHG 

YPaiTEELLRSQLYPEVPPEEFRPFLAKMRGIL 

KSIASADMDFMQLEAFLTAQTKKQGGITSDQ 

AAVISKFWKSHKTKlRESLNfKQSRWNSGLRG 

LSWRVDGKSQSRHSAQIHTPVAIIELELGKYG 

QESEFLCLEFDEVKVNQILKTLSEVEESISTLIS 
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eotide 
seq- 
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SEQID 
NO: of 
peptide 
seq- 
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hod 


SEQ 
ID NO: 
in 
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914 


Predicted 

beginning 

nucleotide 

location 

correspondi 

ng to first 

amino acid 

residue of 

peptide 

sequence 


Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue 
of peptide 
sequence 


Amino acid sequence (A=Alanine C=Cysteine, 
D-Aspartic Acid, E-Glutamic Acid, 
F-PhenylaJanine, GKjlycine, H-Histidine, 
l=Iso leucine, K=Lysine. L=Leucine, 
M=Methionine t N=Asparagine, P=Proiine, 
Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=VaIine, W=Tryptophan, 
Y=Tyrosine, X=Unknown, *=Stop codon, 
/"possible nucleotide deletion, V-possible 
nucleotide insertion 














QPN 


1339 


2689 


A 


10386 


50 


390 


LGAMAKHHPDLIFCRKQAGVAIGRLCEKCDG 
KCVICDSYVRPCTLVRJCDECNyGSYQGRCVI 
CGGPGVSDAYYCKECTIQEKDRDGCPKIVNL 
GSSKTDLFYERKKYGFKKR 


1340 


2690 


A 


10388 


113 


3472 


SQLRKGASATHSSPSRTDCIAQMMDI Y V CLK 

RP S WM VDNKRMRT ASNFQ WL L STFILL YL M 

NQVNSQKKGAPHDLKCVTNNLQVWNCSWK 

APSOTGRGTDYEVCIENRSRSCYQLEKTSIKIP 

ALSHGDYEITINSLHDFGSSTSKFTLNEQNVSL 

IPDTPEILNLSADFSTSTLYLKWNDRGSVFPHR 

SNVIWEIKVLRKESMELVKJLVTHNTTLNGKD 

TLHHWSWASDMPLECAIHFVEIRCYIDNLHFS 

GLEEWSDWSPVKNISWTPDSQTKVFPQDKVIL 

VGSDITFCCVSQEKVLSALIGHTNCPLIHLDGE 

NVAIKIRNISVSASSGTNVVFTTEDNIFGTVIF 

AGYPPDTPQQLNCETHDLKEIICSWNPGRVTA 

LVGPRATSYTLVESFSGKYVRLKRAJEAPTNES 

YQLLFQMLPNQEIYNFTLNAHNPLGRSQSTIL 

VNITEKVYPHTPTSFKVKDINSTAVKLSWHLP 

GNFAKINFLCEIEIKKSNSVQEQRNVTIKGVE 

NSSYLVALDKLNPYTLYTFR1RCSTETFWKW 

SKWSNKFCQHI.TTFASPSKGPDTWRFAVSSDG 

KNLIIY WKPLPINE ANGKIL SYNVSCSS DEETQ 

SLSEIPDPQHKAEIRLDKNDY1ISWAKNSVGS 

SPPSK1ASMEIPNDDLKIEQVVGMGKGILLTW 

HYDPNMTCDYVIKWCNSSRSEPCLMDWRKV 

PSNSTETVTESDEFRPGIRYNFFLYGCRNQGY 

QLLRSMIG YTEELAPIV APNFTVEDTS AD S ILV 

KWEDIPVEELRGFLRGYLFYFGKGERJDTSK.M 

RVLESGRSDIKVKNITDISQKTLRIADLQGKTS 

YHLVLRAYTDGGVGPEKSMYVVTKENSVGL 

IIAIHPVAVAVrv'GVVTSILCYRKREW[KETFY 

PDIPNPENCKALQFQK S VCEGSSALKTEEMNP 

CTPNNVEVLETRSAFPKIEDTErVSPVAERPEN 

RSDAKPENHVVESYCPPIIEEEIPNPAADETGG 

TAQVIYIDVQSMYQPQAKPEEEQENDPVGGA 

GYKPQMHLPINSTVEDIAAEKDLDKTAGYRP 

QANVNTWNLVSPDSPRSIDSNSEIVSFGSPCSI 

NSRQFLIPPKDEDSPKSNGGGWSFTNFFQNKP 

ND 


1341 


2691 


A 


10392 


1 


5057 


MLPPKHLSATKPKKSWAPNLYELDSDLTKEP 

DVIIGEGPTDSEFFHQRFRNLIYVEFVGPRKTL 

IKLRNLCLDWLQPETRTKEEIIELL VLEQ YLTI I 

PEKLKPWVRAKKPENCEKLVTLLENYKEMY 

QPEGESLHGVLWSAGLRCPLGLSASTLLTW 

SGLDNSLSWAAVGMSCVLWDIELHHDFLGV 

ATKSVSTHAQGDAAQGLGGTIVRMWARDSN 

LATGVLLDDNNSDVTSDDDMTRNRRESSPPH 

SV1 ISFSGDRDWDRRGRSRDTEPRDRWSI ITR 

NPRSRMPPRDLSLPWAKTSFEMDREDDRDS 

RAYESRSQDAESYQNWDLAEDRKPHNTIQD 

NMENYRKLLSLGVQLAEDDGHSF1MTQGHSS 

RSKRSAYPSTSRGLKTMPEAKKSTHRRGICED 

ESSHGVTMEKFIKDVSRSSKSGRARESSDRSQ 

RFPRMSDDNWKDISLNKRESVIQQRVYEGNA 

FRGGFRFNSTLVSRKRVLERKRRYHFDTDGK 

GSIHDQKGCPRKKPFECGSEMRKAMSVSSLS 

SLSSPSFTESQPIDFGAMPYVCDECGRSFSVIS 

EFVEHQIMHTRENLYEYGESFIHSVAVSEVQK 

SQVGGKRFECKDCGETFNKSAALAEHRKTHA 

RGYLVECKNQECEEAFMPSPTFSELQKIYGK 

DKFYECRVCKETFLHSSALIEHQK1HFGDDKD 

NEREHERERERERGETFRPSPALNEFQKMYG 

KEKMYECKVCGETFLHSSSLKEHQKIHTRGN 

PFENKGKVCEETFIPGQSLKRRQKTYNKEKLC 

DFTDGRD AFMQ S SELSEHQKIHSRKNLFEGR 

GYEKSV1HSGPFTESQKSHTITRPLESDEDEKA 

FTISSNPYENQKIPTKENVYEAKSYERSVIHSL 

AS VEAQKSH S V AGPSKPKVMAESTIQSFD AIN 

HQRVRAGGNTSEGREYSRSVIHSLVASKPPRS 

HNGNELVESNEKGESSIY1SDLNDKRQKIPAR 

ENPCEGGSKKRNYEDSVIQSVrT^AKPQKSVP 

GEGSGEFKKDGEFSVPSSNVREYQKARAKKK 

YIEHRSNETSVIHSLPFGEQTFRPRGMLYECQ 

F.CGECFAHSSDLTEHQKIHDREKPSGSRNYE 

WSVIRSLAPTDPQTSYAQEQYAKEQARNKCK 

DFRQFFATSEDLNTNQKTYDQEKSHGEESQGE 

NTDGEETHSEETHGQETIEDPVIQGSDMEDPQ 

KDDPDDKIYECEDCGLGFVDLTDLTDHQKVH 

SRKCLVDSREYTHSV1HTHSISEYQRDYTGEQ 

LYECPKCGESFIHSSFLFEHQRIHEQDQLYSM 

KGCDDGFIALLPMKPRRNRAAERNPALAGSA 

IRCLLCGQGFIHSSALNEHMRLHREDDLLEQS 

ESFVNPAELADHVTVHKNEPYEYGSSTTHTS 

FLTEPLKGAIPFYECKDCGKSFrHSTVLTKHKE 

LHLEEEEEDEAAAAAAAAAQEVEANVHVPQ 

VVL RIQGLNVE AAEPE VEAAEPEVEAAEPE V 

EAAEPNGEAEGPDGEAAEPIGEAGQPNGEAE 

QPNGDADEPDGAGIEDPEERAEEPEGKAEEPE 

GDADEPDGVGIEDPEEGEDQEIQVEEPYYDC 

HECTETFTS ST AFSEHLKTHASMIIFEPANAFG 

ECSGYIERASTSTGGANQADEKYFKCDVCGQ 

LFNDHLSLARHQNTHTG 


1342 


2692 


A 


10393 


2 


1350 


GRPRS SSDNRNFLRERAGLSS AA VQTRIGNSA 

ASRRSPAARPPVPAPPALPRGRPGTEGSTSLS 

APAVLWAVAWVVWSAVAWAMANYIHV 

PPG SPE VPKLNVT VQDQEEHRCREGALSLLQ 

KT.RPHWDPOFVTLOLFTDGITNKLIGCYVGN 

TMEDWLVRTYGNKTELLVDRDEEVKSFRVL 

QAHGCAPQLYCTFNNGLCYEFIQGEALDPKH 

VCNPAIFRLIARQLAKIHAIHAHNGWIPKSNL 

WLKMGKYFSLIPTGFADEDINKRFLSDIPSSQI 

LQEEMTWMK^ILSmGSPVVLCHNDLLCKNII 

YNEKQGDVQF1DYEYSGYNYLAYDIGKHFNE 

FAGVSDVDYSLYPDRELQSQWLRAYLEAYK 

EFKGFGTEVTEKEVEILFIQVNQFALASHFFW 

GLWALIQAKYST1EFDFLGYAIVRFNQYFKM 

KPEVTALKVPE 


1343 


2693 


A 


10394 


102 


839 


PEAQTSAVLAREKGHLPTMRHEAPMQMASA 

QDARYGQKDSSDQNFDYMFKLLIIGNSSVGK 

TSFLFRYADDSFTSAFVSTVGIDFKVKTVFKN 

EKRIKLQrWDTAGQERYRTTrTAYYRGAMGFI 

LMYDITNEESFNAVQDWSTQIKTYSWDNAQ 

VILVGNKCDMEDERVISTERGQHLGEQLGFE 

FFETSAKDNINVKQTFERLVDII CDKMSESLET 

DPAITAAKQNTRLKETPPPPQPNCAC 


1344 


2694 


A 


10395 


2 


4136 


DRPPWNSRVDDFVTNLtHLSSKGHISPAKDTS 
LQQRTPAEMSPVLHFYVRPSOHEGAASGHTR 

RKLQGKLPELQGVETELCVNVNWTAEAJLPSA 

EETKKLMWLFGCPLLLDDVARESWLLPGSN 

DLLLEVGPRLNFSTPTSTNIVSVCRATGLGPV 

DRVETTRRYRLSFAHPPSAEVEAIALATLHDR 

MTEQHFPHPIQSFSPESMPEPLNGPINILGEGR 

LALEKANQELG LALDSWDLDFYTKRFQELQR 

NPSTVEAFDLAQSNSEHSRHWFFKGQLHVDG 

QKLVHSLFES1 M STQESSNPNNVLKFCDNS S A 

IQGKEVRFLRPEDPTRPSRFQQQQGLRHVVFT 

AETHNFPTGVCPFSGATTGTGGRIRDVQCTG 

RGAHVVAGTAGYCFGNLHIPGYNLPWEDLSF 

QYPGNFARPLEVAIEASNGASDYGNKFGEPV 

LAGFARSLGLQLPDGQRREWIKPIMFSGGIGS 

MEADHISKEAPEPGMEVVKVGGPVYRIGVGG 

GAASSVQVQGDNTSDLDFGAVQRGDPEMEQ 

KMNRV1RACVEAPKGNPICSLHDQGAGGNG 

NVLKELSDPAGAIIYTSRFQLGDPTLNALEIW 

GAEYQESNALLLRSPNRDFLTHVSARERCPA 

CFVGTITGDRRIVLVDDRECPVRRNGQGDAP 

PTPPPTPVDLELEWVLGKMPRK£FFLQRKPP 

MLQPLALPPGLSVHQALERVLRLPAVASKRY 

LTNKVDRSVGGLVAQQQCVGPLQTPLADVA 

WALSHEELIGAATALGEQPVKSLLDPKVAA 

RLAVAEALTNLVFALVTDLRDVKCSGNWM 

WAAKLPGEGAALADACEAMVAVMAALGVA 

VDGGKDSLSMAARVGTETVRAPGSLVISAYA 

VCPDITATVTPDLKHPEGRGHLLYVALSPGQ 

HRLGGTALAQCFSQLGEHPPDLDLPENLVRA 

FSITQGLLKDRLLCSGHDVSDGGLVTCLLEM 

AFAGNCGLQVDVPVPRVDVLSVLFAEEPGLV 

LEVQEPDLAQVLKRYRDAGLHCLELGHTGE 

AGPHAMVRVSVNGAWLEEPVGELRALWEE 

T*?FOT nRI flAFPRPVAFFFRPiT RFRMfiPSYP 

LPPTFPKASVPREPGGPSPRVAILREEGSNGDR 

EMADAFHLAGFEVWDVTMQDLCSGAIGLDT 

FRGVAFVGGFSYADVLGSAKGWAAAVTFHP 

RAGAELRRFRKRPDTFSLGVCNGCQLLALLG 

WVGGDPNEDAAEMGPDSQPARPGLLLRHNL 

SGRYESRWASVRVGPGPALMLRGMEOAVLP 

VWSAHGEGYVAFSSPELOAOIEARGLAPLFTW 

ADDDGNPTEQYPLNPNGSPGGVAGICSCDGR 

HLAVMPHPERAVRPWQWAWRPPPFDTLTTS 

PWLQLFINARNWTLEGSC 


1345 


2695 


A 


10396 


65 


642 


GVRGFWAOTMASRAGPRAAGTDGSDFQHRE 

RVAMHYQMSVTLKYEIKKLIYVHLVIWLLLV 

AKMSVGHLRXLSHDQVAMPYQWEYPYLLSI 

LPSLLGLLSFPRNNISYLVLSMISMGLFSIAPLI 

YGSMEMFPAAQQLYRHGKAYRFLFGFSAVSt 

MYLVLVLAVQVHAWQLYYSKKLLDSWFTST 

QEKKHK 


1346 


2696 


A 


10398 


1 


718 


DDFVRCGPQSAAMGASARLLRAVIMGAPGS 

GKGTVSSRITTHFELKHLSSGDLJLRDNMLRGT 

EIGVLAKAFroQGKLIPDDVMTRLALHELKNL 

TQYSWLLDGFPRTLPQAEALDRAYQIDTVINL 

NVPFEV1KQRLTARWIHPASGRVYNIEFNPPK 

TVG1DDLTGEPLIQREDDKPETVIKRLKAYED 

QTKPVLEYYQKXGVLETFSGTETNKIWPYVY 

AFLQ IXVPQRSQKASVTP 


1347 


2697 


A 


10402 


153 


1969 


KHRQETWALDMAPEIHMTGPMCtlENTNGEL 
VANPEALKJLSAITQPVVWAIVGLYRTGKSY 

LMNKLAGKKKGFSLGSTVKSHTKGIWMWCV 

PHPKKPEHTLVLLDTEGLGDVKKGDNQNDS 

W1FTL A VLL S STL VYN SM GTTN QQ AM DQ L YY 

VTELTHRIRSKS SPDENENEDS ADFVSFFPDF V 

WTLRDFSLDLEADGQPLTPDEYLEYSL1CLTQ 

GTSQKDKNFNLPRJLCIRKFFPKKKCFVFDLPI 

HRRXLAQLEKLQDEELDPEFVQQVADFCSYI 

FSNSKTKTLSGGIKVNGPRLESLVLTYINAISR 

GDLPCMENAVLALAQIENSAAVQKAIAHYD 

QQMGQKVQLPAETLQELLDLHRVSEREATEV 

YMKNSFKDVDHLFQKKLAAQLDKKRDDFCK 

QNQEASSDRCSALLQVIFSPLEEEVKAGIYSK 

PGGYCLFIQKLQDLEKKYYEEPRKGFQAEEIL 

QTYLKSKESVTDAILQTDQILTEKEKEIEVEC 

VKAESAQASAKMVEEMQDCYQQMMEEKEKS 

YQEHVKQLTEKMERERAQLLEEQEKTLTSKL 

QEQARVLKERCQGESTQLQNEIQKLQKTLKfC 

KTKRYMSHKLKI 


1348 


2698 


A 


10404 


5 


892 


TQLPAPLSGVLSRLQLGSGAPLLTWVQETAG 

VAGGAPRRRTPVTMWRLLARASAPLLRVPLS 

DS WALLP ASAG VKTLLPVPSFED VSIPEKPKL 

RFERAPLVPKVRREPKNLSDIRGPSTEATEFT 

EGNFA1LALGGGYLHWGHFEMMRXTINRSM 

DPKNMFAIWRVTAPFKPI I'RKSVGHPJvlGGGK 

GAIDHYVTPVKAGRLVVEMGGRCEFEEVQG 

FLDQVAHKLPFAAKAVSRGTLEKMRKDQEE 

RERNN QNP WTF ERJ AT AN ML G I RK VL S P Y D L 

THKGKYWGKFYMPKRV 


1349 


2699 


A 


10409 


59 


1184 


LRRNCSALGGLFQTIISDMKGSYPVWEDFINK 

AGKLQSQLRTTVVAAAAFLDAFQKVADMAT 

NTRGGTREIGSALTRMCMRHRSIEAKLRQFSS 

ALIDCLINPLQEQMEEWKJCVANQLDKDHAK 

EYKKARQEIKKKSSDTLKLQKKAKKGRGDIQ 

PQLDSALQDVNDKYLLLEETEKQAVRKALE 

ERGRFCTFIS MLRP VIEEEI SMLGEITHLQTISE 

DLKSLTMDPHKLPSSSEQVILDLKGSDYSWS 

YQTPPSSPSTTMSRKSSVC5SLNSVNSSDSRSS 

GSHSHSPSSHYRYRSSNLAQQAPVRLSSVSSH 

DSGFISQDAFQSKSPSPMPPEAPNQRRKEKRE 

PDPNGGGPTTASGPPAAAEFAQRPRSM 


1350 


2700 


A 


10410 


511 


958 


AGRGGPGKPVSWSSGPGSPGQTQRRSWVKST 
RGHSSLLPPSQDFVAGLSVILRGTVDDRLNW 
AFNLYDLNKDGCITKEEMLDIMKSrYDMMG 
KYTYPALREEAPREHVESFFQKMDRNKDGV 
VTIEEFIESCQKDEN IMR SMQLFDNVI 

WHAT IS CLAIMED IS: 

1. An isolated polynucleotide comprising a nucleotide sequence selected from the group 
consisting of SEQ ID NO: 1-1350, a mature protein coding portion of SEQ ID NO: 1-1350, an 
active domain of SEQ ID NO: 1-1350, and complementary sequences thereof. 

2. An isolated polynucleotide encoding a polypeptide with biological activity, wherein said 
polynucleotide hybridizes to the polynucleotide of claim 1 under stringent hybridization 
conditions. 

3. An isolated polynucleotide encoding a polypeptide with biological activity, wherein said 
polynucleotide has greater than about 90% sequence identity with the polynucleotide of claim 1. 

4. The polynucleotide of claim 1 wherein said polynucleotide is DNA. 

5. An isolated polynucleotide of claim 1 wherein said polynucleotide comprises the 
complementary sequences. 

6. A vector comprising the polynucleotide of claim 1. 

7. An expression vector comprising the polynucleotide of claim 1. 

8. A host cell genetically engineered to comprise the polynucleotide of claim 1. 

9. A host cell genetically engineered to comprise the polynucleotide of claim 1 operatively 
associated with a regulatory sequence that modulates expression of the polynucleotide in the host 
cell. 

10. An isolated polypeptide, wherein the polypeptide is selected from the group consisting of: 

(a) a polypeptide encoded by any one of the polynucleotides of claim 1; and 

(b) a polypeptide encoded by a polynucleotide hybridizing under stringent conditions 
with any one of SEQ ID NO: 1-1350. 

11. A composition comprising the polypeptide of claim 10 and a carrier. 

12. An antibody directed against the polypeptide of claim 10. 

13. A method for detecting the polynucleotide of claim 1 in a sample, comprising: 

a) contacting the sample with a compound that binds to and forms a complex 
with the polynucleotide of claim 1 for a period sufficient to form the complex; and 

b) detecting the complex, so that if a complex is detected, the polynucleotide 
of claim 1 is detected. 

14. A method for detecting the polynucleotide of claim 1 in a sample, comprising: 

a) contacting the sample under stringent hybridization conditions with 
nucleic acid primers that anneal to the polynucleotide of claim 1 under such conditions; 

b) amplifying a product comprising at least a portion of the polynucleotide of 
claim 1; and 

c) detecting said product and thereby the polynucleotide of claim 1 in the 
sample. 

15. The method of claim 14, wherein the polynucleotide is an RNA molecule and the method 
further comprises reverse transcribing an annealed RNA molecule into a cDNA polynucleotide. 

16. A method for detecting the polypeptide of claim 10 in a sample, comprising: 

a) contacting the sample with a compound that binds to and forms a complex 
with the polypeptide under conditions and for a period sufficient to form the complex; and 

b) detecting formation of the complex, so that if a complex formation is 
detected, the polypeptide of claim 10 is detected. 

17. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10 under 
conditions sufficient to form a polypeptide/compound complex; and 

b) detecting the complex, so that if the polypeptide/compound complex is 
detected, a compound that binds to the polypeptide of claim 10 is identified. 

18. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10 in a cell, under 
conditions sufficient to form a polypeptide/compound complex, wherein the complex drives 
expression of a reporter gene sequence in the cell; and 

b) detecting the complex by detecting reporter gene sequence expression, so 
that if the polypeptide/compound complex is detected, a compound that binds to the polypeptide 
of claim 10 is identified. 

19. A method of producing the polypeptide of claim 10, comprising: 

a) culturing a host cell comprising a polynucleotide sequence selected from 
the group consisting of a polynucleotide sequence of SEQ ID NO: 1-1350, a mature protein 
coding portion of SEQ ID NO: 1-1350, an active domain of SEQ ID NO: 1-1350, 
complementary sequences thereof and a polynucleotide sequence hybridizing under stringent 
conditions to SEQ ID NO: 1-1350, under conditions sufficient to express the polypeptide in said 
cell; and 

b) isolating the polypeptide from the cell culture or cells of step (a). 

20. An isolated polypeptide comprising an amino acid sequence selected from the group 
consisting of SEQ ID NO: 1351-2700, the mature protein portion thereof, or the active domain 
thereof. 

21. The polypeptide of claim 20 wherein the polypeptide is provided on a polypeptide array. 

22. A collection of polynucleotides, wherein the collection comprises the sequence 
information of at least one of SEQ ID NO: 1-1350. 

23. The collection of claim 22, wherein the collection is provided on a nucleic acid array. 

24. The collection of claim 23, wherein the array detects full-matches to any one of the 
polynucleotides in the collection. 

25. The collection of claim 23, wherein the array detects mismatches to any one of the 
polynucleotides in the collection. 

26. The collection of claim 22, wherein the collection is provided in a computer-readable 
format. 

27. A method of treatment comprising administering to a mammalian subject in need thereof 
a therapeutic amount of a composition comprising a polypeptide of claim 10 or 20 and a 
pharmaceutically acceptable carrier. 

28. A method of treatment comprising administering to a mammalian subject in need thereof 
a therapeutic amount of a composition comprising an antibody that specifically binds to a 
polypeptide of claim 10 or 20 and a pharmaceutically acceptable carrier. 
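
Claims 3 and 5 refer to percent sequence identity and to complementary sequences without fixing a computational procedure. The following is a minimal illustrative sketch only, not a method taken from this application. It assumes plain A/C/G/T strings, assumes the two sequences compared have already been aligned to equal length by some external tool, and counts gap columns as mismatches. The function names `reverse_complement` and `percent_identity` are hypothetical.

```python
# Illustrative helpers only. The names and conventions below are assumptions
# made for this sketch; they are not definitions taken from the claims.

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return dna.translate(_COMPLEMENT)[::-1]


def percent_identity(a: str, b: str) -> float:
    """Percent identity of two pre-aligned, equal-length sequences.

    Assumption: a column where either sequence has a gap ('-') counts as a
    mismatch, and the denominator is the full alignment length.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must be non-empty")
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / len(a)
```

Under these assumptions, a candidate could be screened against a threshold such as the 90% figure recited in claim 3 with `percent_identity(candidate, reference) > 90.0`. Other identity conventions (for example, excluding gap columns from the denominator) would give different values.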






WO 01/57188 



PCT/US01/03800 



Pages 340 to 1963 of this application contain amino acid sequence listings. 
They can be obtained at the address given below. 
